Abstract

Game theory is one of the key paradigms behind many scientific disciplines from biology to behavioral sciences to 
economics. In its evolutionary form and especially when the interacting agents are linked in a specific social network 
the underlying solution concepts and methods are very similar to those applied in non-equilibrium statistical physics. 
This review gives a tutorial-type overview of the field for physicists. The first four sections introduce the necessary 
background in classical and evolutionary game theory from the basic definitions to the most important results. 
The fifth section surveys the topological complications implied by non-mean-field-type social network structures in 
general. The next three sections discuss in detail the dynamic behavior of three prominent classes of models: the 
Prisoner's Dilemma, the Rock-Scissors-Paper game, and Competing Associations. The major theme of the review is 
in what sense and how the graph structure of interactions can modify and enrich the picture of long term behavioral 
patterns emerging in evolutionary games. 
1. Introduction



Game theory is the unifying paradigm behind many scientific disciplines. It is a set of analytical tools
and solution concepts, which provide explanatory and predictive power in interactive decision situations,
when the aims, goals and preferences of the participating players are potentially in conflict. It has successful
applications in such diverse fields as evolutionary biology and psychology, computer science and operations
research, political science and military strategy, cultural anthropology, ethics and moral philosophy, and
economics. The cohesive force of the theory stems from its formal mathematical structure which allows the
practitioners to abstract away the common strategic essence of the actual biological, social or economic
situation. Game theory creates a unified framework of abstract models and metaphors, together with a
consistent methodology, in which these problems can be recast and analyzed.

The appearance of game theory as an accepted physics research agenda is a relatively late event. It required
the mutual reinforcing of two important factors: the opening of physics, especially statistical physics, towards
new interdisciplinary research directions, and the sufficient maturity of game theory itself in the sense that
it had started to tackle complexity problems, where the competence and background experience of the
physics community could become a valuable asset. Two new disciplines, socio- and econophysics, were born,
and the already existing field of biological physics got a new impetus with the clear mission to utilize the
theoretical machinery of physics for making progress in questions whose investigation was traditionally
connected to the social sciences, economics, or biology, and which were formulated to a large extent using
classical and evolutionary game theory (Sigmund, 1993; Ball, 2004; Nowak, 2006a). The purpose of this review is
to present the fruits of this interdisciplinary collaboration in one particularly important area, namely in
the case when non-cooperative games are played by agents whose connectivity pattern (social network) is
characterized by a nontrivial graph structure.

The birth of game theory is usually dated to the seminal book of von Neumann and Morgenstern (1944).
This book was indeed the first comprehensive treatise with a wide enough scope. Note, however, that, as for
most scientific theories, game theory also had forerunners much earlier on. The French economist Augustin
Cournot solved a quantity choice problem under duopoly using some restricted version of the Nash
equilibrium concept as early as 1838. His theory was generalized to price rivalry in 1883 by Joseph Bertrand.
Cooperative game theory concepts appeared already in 1881 in a work of Ysidro Edgeworth. The concept of
a mixed strategy and the minimax solution for two-person games were developed originally by Emile Borel.
The first theorem of game theory was proven in 1913 by E. Zermelo about the strict determinacy in chess.
A particularly detailed account of the early (and later) history of game theory can be found in Paul Walker's
chronology of game theory available on the Web (Walker, 1995) or in William Poundstone's book (Poundstone,
1992). You can also consult Gambarelli and Owen (2004).

A very important milestone in the theory is John Nash's invention of a strategic equilibrium concept for
non-cooperative games (Nash, 1950). The Nash equilibrium of a game is a profile of strategies such that no
player has a unilateral incentive to deviate from this by choosing another strategy. In other words, in a Nash
equilibrium the strategies form "best responses" to each other. The Nash equilibrium can be considered as
an extension of von Neumann's minimax solution for non-zero-sum games. Besides defining the equilibrium
concept, Nash also gave a proof of its existence under rather general assumptions. The Nash equilibrium,
and its later refinements, constitute the "solution" of the game, i.e., our best prediction for the outcome in
the given non-cooperative decision situation.

One of the most intriguing aspects of the Nash equilibrium is that it is not necessarily efficient in terms
of the aggregate social welfare. There are many model situations, like the Prisoner's Dilemma or the Tragedy
of the Commons, where the Nash equilibrium could obviously be improved upon by a central planner. Without
such supreme control, however, the more efficient outcomes are made unstable by the individual incentives of
the players. The only stable solution is the Nash equilibrium, which is inefficient. One of the most important
tasks of game theory is to provide guidelines (normative insight) on how to resolve such social dilemmas, and
to explain how microscopic (individual) agent-agent interactions without a central planner may
still generate a (spontaneous) aggregate cooperation towards a more efficient outcome in many real-life
situations.
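
The inefficiency of the Nash equilibrium can be made concrete with a small numerical check. The sketch below is only an illustration: the Prisoner's Dilemma payoff values (5, 3, 1, 0) and the helper name `pure_nash` are assumptions introduced here, not taken from the text. It enumerates the pure strategy profiles from which no player gains by deviating alone, and compares the aggregate payoff there with that of mutual cooperation.

```python
# Illustrative Prisoner's Dilemma payoffs (assumed values, not from the text).
# Index 0 = cooperate, 1 = defect; payoff_row[i][j] is the row player's payoff
# when the row player chooses i and the column player chooses j.
payoff_row = [[3, 0],
              [5, 1]]
# The game is symmetric, so the column player's payoff for (i, j) is payoff_row[j][i].
payoff_col = [[payoff_row[j][i] for j in range(2)] for i in range(2)]

def pure_nash(row, col):
    """Return all pure profiles (i, j) where no unilateral deviation pays off."""
    found = []
    for i in range(len(row)):
        for j in range(len(row[0])):
            row_is_best = row[i][j] >= max(row[k][j] for k in range(len(row)))
            col_is_best = col[i][j] >= max(col[i])
            if row_is_best and col_is_best:
                found.append((i, j))
    return found

equilibria = pure_nash(payoff_row, payoff_col)
# Aggregate payoff (sum over both players) for every strategy profile.
welfare = [[payoff_row[i][j] + payoff_col[i][j] for j in range(2)] for i in range(2)]
```

With these assumed payoffs the only pure equilibrium is mutual defection, although mutual cooperation yields a higher aggregate payoff.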

Classical (rational) game theory is based upon a number of severe assumptions about the structure of a
game. Some of these assumptions were systematically relaxed during the history of the theory in order to
push its limits further. Game theory assumes that agents (players) have well defined and consistent goals and
preferences which can be described by a utility function. The utility is the measure of satisfaction the player
derives from a certain outcome of the game, and the player's goal is to maximize her utility. Maximization (or
minimization) principles abound in science. It is, however, worth highlighting a very important point here:
the maximization problem of game theory differs from that of physics. In a physical theory the standard
situation is to have a single function (say, a Hamiltonian or a thermodynamic potential) whose extremum
condition characterizes the whole system. In game theory the number of functions to maximize is typically
as large as the number of interacting agents. While physics tries to optimize in a (sometimes rugged but)
fixed landscape, the agents of game theory continuously restructure the landscape for each other in pursuit
of their selfish individual optimum.¹

Another key assumption in the classical theory is that players are perfectly rational (hyper-rational),
and this is common knowledge. Rationality, however, seems to be an ill-defined concept. There are extreme
opinions arguing that the notion of perfect rationality is no more than pure tautology: rational behavior
is the one which complies with the directives of game theory, which in turn is based on the assumption of
rationality. It is certainly not by chance that a central recurrent theme in the history of game theory is
how to define rationality. In fact, any working definition of rationality is a negative definition, not telling us
what rational agents do, but rather what they do not. For example, the usual minimal definition states that
rational players do not play strictly dominated strategies (Aumann, 1992; Gibbons, 1992). Paradoxically,
the straightforward application of this definition seems to preclude cooperation in games involving social
dilemmas like a (finitely repeated) Prisoner's Dilemma or Public Good games, whereas cooperation does occur
in real social situations. Another problem is that in many games low-level notions of rationality permit several
theoretically admissible outcomes of the game. Some of these are obviously more successful predictions than
others in real-life situations. The answer of the classical theory to these shortcomings was to refine the
concept of rationality and equivalently the concept of the strategic equilibrium.

The post-Nash history of game theory is mostly the history of such refinements. The Nash equilibrium
concept seems to have enough predictive power in static games with complete information. The two major
streams of extensions are toward dynamic games and games with incomplete information. Dynamic games are
the ones where the timing of decision making plays a role. In these games the simple Nash equilibrium concept
would allow outcomes which are based on non-credible threats or promises. In order to exclude these spurious
equilibria Selten introduced the concept of a subgame perfect Nash equilibrium, which requires Nash-type
optimality in all possible subgames (Selten, 1965). Incomplete information, on the other hand, means that
the players' available strategy sets and associated payoffs (utility) are not common knowledge.² In order
to handle games with incomplete information the theory requires that the players hold beliefs about the
unknown parameters and that these beliefs are consistent (rational) in some properly defined sense. This has led
to the concept of Bayesian Nash equilibrium for static games and to perfect Bayesian equilibrium or sequential
equilibrium in dynamic games (Fudenberg and Tirole, 1991; Gibbons, 1992). Many other refinements (Pareto
efficiency, risk dominance, focal outcome, etc.) with a lesser domain of applicability have been proposed to
provide guidelines for equilibrium selection in the case of multiple Nash equilibria (Harsanyi and Selten,
1988). Other refinements, like trembling hand perfection, still within the classical framework, opened up
the way for eroding the assumption of perfect rationality. However, this program has only reached its full
potential by the general acceptance of bounded rationality in the framework of evolutionary game theory.

Despite the undoubted success of classical game theory, the paradigm soon confronted its limitations.
In many specific cases further progress seemed to rely upon the relaxation of some of the key assumptions. A 
typical example where rational game theory seems to give an inadequate answer is the "backward induction 
paradox" related to repeated (iterated) social dilemmas like the Repeated Prisoner's Dilemma. According to 
game theory the only subgame perfect Nash equilibrium in the finitely repeated game is the one determined 
by backward induction, i.e., when both players defect in all rounds. Nevertheless, cooperation is frequently 
observed in real-life psycho-economic experiments. This result either suggests that the abstract Prisoner's 
Dilemma game is not the right model for the situation or that the players do not fulfill all the premises. 
Indeed, there is good reason to believe that many realistic problems, in which the effect of an agent's
action depends on what other agents do, are far too complex for perfect rationality of the players to
be postulated (Conlisk, 1996). The standard deductive reasoning loses its appeal when agents have
non-negligible cognitive limitations, there is a cost of gathering information about possible outcomes and payoffs,
the agents do not have consistent preferences, or the common knowledge of the players' rationality fails to
hold. A possible way out is inductive reasoning, i.e., a trial-and-error approach, in which agents continuously
form hypotheses about their environment, build strategies accordingly, observe their performance in practice,
and verify or discard their assumptions based on empirical success rates. In this approach the outcome
(solution) of a problem is determined by the evolving mental state (mental representation) of the constituting
agents. Mind necessarily becomes an endogenous dynamic variable of the model. This kind of bounded
rationality may explain why in many situations people respond instinctively and play according to heuristic
rules and social norms rather than adopting the strategies indicated by rational game theory.

1 A certain class of games, the so-called potential games, can be recast into the form of a single function optimization problem.
But this is an exception rather than a rule.

2 Incomplete information differs from the similar concept of imperfect information. The latter refers to the case when some
of the history of the game is unknown to the players at the time of decision making. For example, Chess is a game with
perfect information because players know the whole previous history of the game, whereas the Prisoner's Dilemma is a game
with imperfect information due to the simultaneity of the players' decisions. Nevertheless, both are games with complete
information.

Bounded rationality becomes a natural concept when the goal of the theory is to understand animal 
behavior. Individuals in an animal population do not make conscious decisions about strategy, even though 
the incentive structure of the underlying formal game they "play" is identical to the ones discussed under the 
assumption of perfect rationality in classical game theory. In most cases the applied strategies are genetically 
coded and maintained during the whole life-cycle, the strategy space is constrained (e.g., mixed strategies 
may be excluded), or strategy adoption or change is severely restricted by biologically predetermined learning 
rules or mutation rates. The success of a strategy applied is measured by biological fitness, which is usually
related to reproductive success.

Evolutionary game theory is an extension of the classical paradigm towards bounded rationality. There
is, however, another aspect of the theory which was swept under the rug in the classical approach, but gets
special emphasis in the evolutionary version, namely dynamics. Dynamical issues were mostly neglected
classically, because the assumption of perfect rationality made such questions irrelevant. Full deductive
rationality allows the players to derive and construct the equilibrium solution instantaneously. In this spirit,
when dynamic methods were still applied, like Brown's fictitious play (Brown, 1951), they only served
as a technical aid for deriving the equilibrium. Bounded rationality, on the other hand, is inseparable
from dynamics. Contrary to perfect rationality, bounded rationality is always defined in a positive way,
postulating what boundedly rational agents do. These behavioral rules are dynamic rules, specifying how
much of the game's earlier history is taken into consideration (memory), how far ahead agents think
(short-sightedness, myopia), how they search for available strategies (search space), how they switch to more
successful ones (adaptive learning), and what all these mean at the population level in terms of frequencies
of strategies.

The idea of bounded rationality has the most obvious relevance in biology. It is not too surprising that
early applications of the evolutionary perspective of game theory appeared in the biology literature. It is
customary to cite R. A. Fisher's analysis on the equality of the sex ratio (Fisher, 1930) as one of the initial
works with such ideas, and R. C. Lewontin's early paper, which was probably the first to make a formal
connection between evolution and game theory (Lewontin, 1961). However, the real onset of the theory can
be dated to two seminal books in the early 80s: J. Maynard Smith's "Evolution and the Theory of Games"
(Maynard Smith, 1982), which introduced the concept of evolutionarily stable strategies, and R. Axelrod's
"The Evolution of Cooperation" (Axelrod, 1984), which opened up the field for economics and the social
sciences. Whereas biologists have used game theory to understand and predict certain outcomes of organic
evolution and animal behavior, the social sciences community welcomed the method as a tool to understand
social learning and "cultural evolution", a notion referring to changes in human beliefs, values, behavioral
patterns and social norms.

There is a static and a dynamic perspective of evolutionary game theory. Maynard Smith's definition of
the evolutionary stability of a Nash equilibrium is a static concept which does not require solving
time-dependent dynamic equations. In simple terms evolutionary stability means that a rare mutant cannot
successfully invade the population. The condition for evolutionary stability can be checked directly without
incurring complex dynamic issues. The dynamic perspective, on the other hand, operates by explicitly
postulating dynamical rules. These rules can be prescribed as deterministic rules at the population level
for the rate of change of strategy frequencies or as microscopic stochastic rules at the agent level
(agent-based dynamics). Since bounded rationality may have different forms, there are many different dynamical
rules one can consider. The most appropriate dynamics depends on the specificity of the actual biological
or socio-economical situation under study. In biological applications the Replicator dynamics is the most
natural choice, which can be derived by assuming that payoffs are directly related to reproductive success.
Socio-economic applications may require other adjustment or learning rules. Both the static and dynamic
perspective of evolutionary game theory provide a basis for equilibrium selection when the classical form of
the game has multiple Nash equilibria.
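
As a rough illustration of a deterministic population-level rule, the sketch below integrates the standard replicator equation, dx_i/dt = x_i [(Ax)_i - x.Ax], with a simple Euler scheme. The Hawk-Dove payoff values (resource value 2, fight cost 4), the step size, and the function name `replicator_step` are assumptions chosen for this example only.

```python
def replicator_step(x, A, dt=0.01):
    """One explicit Euler step of dx_i/dt = x_i * ((A x)_i - x.A x)."""
    n = len(x)
    # Payoff (fitness) of each pure strategy against the current population.
    fitness = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Population average payoff.
    mean_fitness = sum(x[i] * fitness[i] for i in range(n))
    return [x[i] + dt * x[i] * (fitness[i] - mean_fitness) for i in range(n)]

# Hawk-Dove payoffs with assumed resource value V = 2 and fight cost C = 4:
# index 0 = Hawk, 1 = Dove.
A = [[-1.0, 2.0],
     [0.0, 1.0]]

x = [0.1, 0.9]            # initial strategy frequencies (10% Hawks)
for _ in range(20000):    # integrate up to t = 200
    x = replicator_step(x, A)
```

For these assumed values the Hawk frequency approaches V/C = 1/2, the mixed state in which both strategies earn the same payoff. The explicit Euler scheme is the crudest possible integrator and is used here only to keep the sketch short.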

Once the dynamics is specified the major concern is the long run behavior of the system: fixed points,
cycles, and their stability, chaos, etc., and the connection between static concepts (Nash equilibrium,
evolutionary stability) and dynamic predictions. The connection is far from being trivial, but at least for normal
form games with a huge class of "reasonable" population-level dynamics the Folk theorem of evolutionary
game theory holds, asserting that stable rest points are Nash equilibria. Moreover, in games with only two
strategies evolutionary stability is practically equivalent to dynamic stability. In general, however, it turns
out that a static, equilibrium-based analysis is insufficient to provide enough insight into the long-run
behavior of payoff maximizing agents with general adaptive learning dynamics (Hofbauer and Sigmund, 2003).
Dynamic rules of bounded rationality should not necessarily reproduce perfect rationality results.
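
The static stability check can be sketched in a few lines for pure strategies of a symmetric matrix game. The code below applies Maynard Smith's two conditions against pure-strategy mutants only, which is exact for two-strategy games but merely a necessary test when more strategies are available. The payoff values and the function name `is_pure_ess` are assumptions made for this illustration.

```python
def is_pure_ess(A, i):
    """Check whether pure strategy i resists invasion by every other pure strategy.

    A[a][b] is the payoff of strategy a when playing against strategy b.
    Conditions (against each mutant j != i):
      either A[i][i] > A[j][i],
      or A[i][i] == A[j][i] and A[i][j] > A[j][j].
    """
    for j in range(len(A)):
        if j == i:
            continue
        strictly_better = A[i][i] > A[j][i]
        tie_broken = A[i][i] == A[j][i] and A[i][j] > A[j][j]
        if not (strictly_better or tie_broken):
            return False
    return True

# Assumed illustrative payoffs.
prisoners_dilemma = [[3, 0],   # index 0 = cooperate, 1 = defect
                     [5, 1]]
hawk_dove = [[-1, 2],          # index 0 = Hawk, 1 = Dove
             [0, 1]]

pd_ess = [i for i in range(2) if is_pure_ess(prisoners_dilemma, i)]
hd_ess = [i for i in range(2) if is_pure_ess(hawk_dove, i)]
```

With these values defection is the only pure evolutionarily stable strategy of the Prisoner's Dilemma, while the Hawk-Dove game has none, so any stable state there has to be mixed.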

As was argued above, the mission of evolutionary game theory was to remedy three key deficiencies of the 
classical theory: (1) bounded rationality, (2) the lack of dynamics, and (3) equilibrium selection in the case 
of multiple Nash equilibria. Although this mission was accomplished rather successfully, there was a series
of weaknesses remaining. Evolutionary game theory in its early form considered population dynamics on the 
aggregate level. The state variables whose dynamics are followed are variables averaged over the population 
such as the relative strategy abundances. Behavioral rules, on the other hand, control the system on the 
microscopic, agent level. Agent decisions are frequently asynchronous, discrete and may contain stochastic 
elements. Moreover, agents may have different individual preferences, payoffs, strategy options (heterogeneity 
in agent types) or be locally connected to well-defined other agents (structural heterogeneity). 

In large populations these microscopic fluctuations usually average out and produce smooth macroscopic 
behavior for the aggregate quantities. Even though the underlying microscopic rules can be rather different, 
there is a wide class of models where the standard population level dynamics, e.g., the replicator dynamics, 
can indeed be microscopically justified. In these situations the mean-field analysis, assuming an infinite, 
homogeneous population with unbiased random matching, can provide a good qualitative description. In 
other cases, however, the emerging aggregate level behavior may easily differ even qualitatively from the 
naive mean-field analysis. Things can go awry especially when the population is largely heterogeneous in 
agent types and/or when the topology of the interaction graph is nontrivial. 

Although the importance of heterogeneity and structural issues was recognized a long time ago (Föllmer,
1974), the systematic investigation of these questions is still in the forefront of research. The challenge
is high, because heterogeneity in both agent types and connectivity structure breaks down the symmetry
of agents, and thus requires a dramatic change of perspective in the description of the system from the 
aggregate level to the agent level. The resulting huge increase in the relevant system variables makes most 
standard analytical techniques, operating with differential equations, fixed points, etc., largely inapplicable. 
What remains is agent based modeling, meaning extensive numerical simulations and analytical techniques 
going beyond the traditional mean-field level. Although we fully acknowledge the relevance and significance 
of the type heterogeneity problem, this Review will mostly concentrate on structural issues, i.e., we will 
assume that the population we consider consists of identical players (or at least the number of different 
player roles is small), and the players' only asymmetry stems from their unequal interaction neighborhoods. 

It is well-known that real-life interaction networks can possess a rather complex topology, which is far
from the traditional mean-field case. On one hand, there is a large class of situations where the interaction
graph is determined by the geographical location of the participating agents. Biological games are good
examples. The typical structure is two-dimensional. It can be modeled by a regular two-dimensional (2D)
lattice or a graph with nodes in 2D and an exponentially decaying probability of distant links. On the
other hand, games motivated by economic or social situations are typically played on scale-free or small-world
networks, which have rather peculiar statistical properties (Albert and Barabási, 2002). Of course, a
fundamental geographic embedding cannot be ruled out either. Hierarchical structures are also possible with
several levels. In many cases the inter-agent connectivity is not rigid but can continuously evolve in time.

In the simplest spatial evolutionary games agents are located on the sites of a lattice and play repeatedly
with their neighbors. Individual income arises from two-person games played with neighbors, thereby the
total income depends on the distribution of strategies within the neighborhood. From time to time agents
are allowed to modify their strategies in order to increase their utility. Following the basic Darwinian
selection idea, in many models the agents adopt (learn) one of the neighboring strategies that has provided
a higher income. Similar models are widely and fruitfully used in different areas of science to determine 
macroscopic behavior from microscopic interactions. Apparently, many aspects of these models seem to be 
similar to many-particle systems, that is, one can observe different phases and phase transitions when the 
model parameters are tuned. The striking analogies inspired many physicists to contribute to the deeper 
understanding of the field by successfully adopting approaches and tools from statistical physics. 
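
A minimal sketch of such a lattice model is given below. It is one possible reading of the verbal description above, not a specific model from the literature: the square lattice with periodic boundaries, the four-site neighborhood, the random sequential update, the Prisoner's Dilemma payoff values, and all function names are assumptions made for illustration.

```python
import random

# Assumed illustrative Prisoner's Dilemma payoffs: index 0 = cooperate, 1 = defect.
PAYOFF = [[3, 0],
          [5, 1]]
# Four nearest neighbors on a square lattice.
NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def income(grid, x, y):
    """Total payoff of the agent at (x, y) from games with its four neighbors."""
    size = len(grid)
    own = grid[x][y]
    return sum(PAYOFF[own][grid[(x + dx) % size][(y + dy) % size]]
               for dx, dy in NEIGHBORS)

def update(grid, rng):
    """Pick a random agent and a random neighbor; imitate the neighbor if it earns more."""
    size = len(grid)
    x, y = rng.randrange(size), rng.randrange(size)
    dx, dy = rng.choice(NEIGHBORS)
    nx, ny = (x + dx) % size, (y + dy) % size
    if income(grid, nx, ny) > income(grid, x, y):
        grid[x][y] = grid[nx][ny]

def simulate(size=20, steps=20000, seed=0):
    """Run the imitation dynamics from a random initial state and return the lattice."""
    rng = random.Random(seed)
    grid = [[rng.randrange(2) for _ in range(size)] for _ in range(size)]
    for _ in range(steps):
        update(grid, rng)
    return grid

grid = simulate()
cooperator_fraction = sum(row.count(0) for row in grid) / len(grid) ** 2
```

The deterministic "imitate if better" rule is the simplest choice. Stochastic adoption rules, other neighborhoods, or synchronous updates can be substituted in `update` without changing the rest of the sketch.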

Evolutionary games can also exhibit behaviors which do not appear in typical equilibrium physical systems. 
These aspects require the methods of non-equilibrium statistical physics, where such complications have been 
investigated for a long time. In evolutionary games the interactions are frequently asymmetric, the time- 
reversal symmetry is broken for the microscopic steps, and many different (evolutionarily) stable states can 
coexist by forming frozen or self-organizing patterns. 

In spatial models the short-range interactions limit the number of agents who can affect the behavior of 
a given player in finding her best solution. This process can be disturbed fundamentally if the number of 
possible strategies exceeds the number of neighbors. Such a situation can favor the formation of different
strategy associations that can be considered as complex agents with proper spatio-temporal structure, and 
whose competition will determine the final stationary state. In other words, spatial evolutionary games 
provide a mathematical framework for studying the emergence of structural complexity that characterizes 
living material. 

Very recently the research of evolutionary games has become intertwined with the extensive investigation of
networks, because the actual social networks characterizing human interactions possess highly nontrivial
topological properties. The first results clearly demonstrated that the topological features of these networks can
significantly influence the behavior of the games played on them. In many cases "games on graphs" differ
qualitatively from their counterparts defined in a well-mixed (mean-field) population. The thorough mathematical
investigation of these phenomena requires an extension of the traditional tools of non-equilibrium statistical
physics. Evolutionary games lead to dynamical regimes much richer and subtler than those attainable with
traditional equilibrium or non-equilibrium statistical physics models. We need revolutionary new concepts and
methods for the characterization of the emerging complex, self-organizing patterns.

The setup of this review is as follows. The next three Sections summarize the basic concepts and methods 
of rational and evolutionary game theory. Our aim was to cut the material into a digestible size, and 
give a tutorial type introduction to the field, which is traditionally outside the standard curriculum of 
physicists. Admittedly, many highly relevant and interesting aspects have been left out. The focus is on 
non-cooperative matrix games (normal form games) with complete information. These are the games whose 
network extensions have received the most attention in the literature so far. Section 5 is devoted to the 
structure of realistic social networks on which the games can be played. Sections 6 to 8 review the dynamic 
properties of three prominent families of games: the Prisoner's Dilemma, the Rock-Scissors-Paper game, and
Competing Associations. These games show a number of interesting phenomena, which occur due to the 
topological peculiarities of the underlying social network, and nicely illustrate the need to go beyond the 
mean-field approximation for a quantitative description. We discuss open questions and provide an outlook 
for future research areas in Sec. 9. There are three Appendices: the first gives a concise list of the most 
important games discussed in the paper, the second summarizes noteworthy strategies, and the third gives 
a detailed introduction to the Generalized Mean-field Approximation (Cluster Approximation) widely used 
in the main text. 



2. Rational game theory 



The goal of Sections 2 to 4 is to provide a concise summary of the necessary definitions, concepts and
methods of rational (classical) and evolutionary game theory, which can serve as background knowledge in
later sections. Most of the material presented here is treated in much more detail in standard textbooks like
Fudenberg and Tirole (1991); Gibbons (1992); Hofbauer and Sigmund (1998); Weibull (1995); Samuelson
(1997); Gintis (2000); Cressman (2003) and in a recent review by Hofbauer and Sigmund (2003).



2.1. Games, payoffs, strategies 



A game is an abstract formulation of an interactive decision situation with possibly conflicting interests. 
The normal (strategic) form representation of a game specifies (1) the players of the game, (2) their feasible 
actions (to be called pure strategies), and (3) the payoffs received by them for each possible combination 
of actions (the action or strategy profile) that could be chosen by the players. Let $n = 1, \ldots, N$ denote the
players; $S_n = \{e_{n1}, e_{n2}, \ldots, e_{nQ}\}$ the set of pure strategies available to player $n$, with $s_n \in S_n$ an arbitrary
element of this set; $(s_1, \ldots, s_N)$ a given strategy profile of all players; and $u_n(s_1, \ldots, s_N)$ player $n$'s payoff
function (utility function), i.e., the measure of her satisfaction if the strategy profile $(s_1, \ldots, s_N)$ gets realized.
Such a game can be denoted as $G = \{S_1, \ldots, S_N; u_1, \ldots, u_N\}$.³

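In the case when there are only two players $n = 1, 2$ and the set of available strategies is discrete,
$S_1 = \{e_1, \ldots, e_Q\}$, $S_2 = \{f_1, \ldots, f_R\}$, it is customary to write the game in a bi-matrix form $G = (A, B^T)$,
which is a shorthand for the payoff table⁴

\[
\begin{array}{cc|ccc}
 & & \multicolumn{3}{c}{\text{Player 2}} \\
 & & f_1 & \cdots & f_R \\
\hline
 & e_1 & (A_{11}, B^T_{11}) & \cdots & (A_{1R}, B^T_{1R}) \\
\text{Player 1} & \vdots & \vdots & \ddots & \vdots \\
 & e_Q & (A_{Q1}, B^T_{Q1}) & \cdots & (A_{QR}, B^T_{QR})
\end{array}
\tag{1}
\]

Here the matrix $A_{ij} = u_1(e_i, f_j)$ [resp., $B^T_{ij} = u_2(e_i, f_j)$] denotes Player 1's [resp., Player 2's] payoff for the
strategy profile $(e_i, f_j)$.⁵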

Two-player normal-form games can be symmetric (so-called matrix games) and asymmetric (bi-matrix
games), with symmetry referring to the roles of players. For a symmetric game players are identical in all
possible respects, they possess identical strategy options and payoffs, i.e., necessarily $R = Q$ and $B = A$. It
does not matter for a player whether she should play in the role of Player 1 or Player 2. A symmetric normal
form game is fully characterized by the single payoff matrix $A$, and we can formally write $G = (A, A^T) = A$.
The Hawk-Dove game (see Appendix A for a detailed description of games mentioned in the text), the
Coordination game or the Prisoner's Dilemma are all symmetric games.

Player asymmetry, on the other hand, is often an important feature of the game like in male-female,
buyer-seller, owner-intruder, or sender-receiver interactions. For asymmetric games (bi-matrix games) $B \neq A$, thus
both matrices should be specified to define the game, $G = (A, B^T)$. It makes a difference whether the player
is called to act in the role of Player 1 or Player 2.

Sometimes the players change roles frequently during interactions like in the Sender-Receiver game of 
communication, and a symmetrized version of the underlying asymmetric game is considered. The players 
can act as Player 1 with probability p and as Player 2 with probability 1 — p. These games are called role 



3 Note that this definition only refers to static, one-shot games with simultaneous decisions of the players, and complete
information. The more general case (dynamic games or games with incomplete information) is usually given in the extensive
form representation (Fudenberg and Tirole, 1991; Cressman, 2003), which, in addition to the above list, also specifies: (4) when
each player has the move, (5) what the choice alternatives are at each move, and (6) what information is available to the player
at the moment of the move. Extensive form games can also be cast in normal form, but this may imply a substantial loss of
information. We do not consider extensive form games in this review.

4 It is customary to define B as the transpose of what appears in the table. 

5 Not all normal form games can be written conveniently in a matrix form. The Cournot and Bertrand games are simple
examples (Gibbons, 1992; Marsili and Zhang, 1997), where the strategy space is not discrete, $S_n = [0, \infty)$, and payoffs should
be treated in functional form. This is, however, only a minor technical difference; the solution concepts and methods remain
unchanged.






games (Hofbauer and Sigmund, 1998). When $p = 1/2$ the game is symmetric in the higher dimensional
strategy space $S_1 \times S_2$, formed by the pairs of the elementary strategies.

Symmetric games whose payoff matrix is symmetric, $A = A^T$, are called doubly symmetric, and can be
denoted as $G = (A, A) = A$. These games belong to the so-called partnership or potential games; see Section
2.3 for a discussion. The asymmetric game $G = (A, -A)$ is called a zero-sum game. This is the class where
the existence of an equilibrium was first proved in the form of the Minimax Theorem (von Neumann, 1928).



The strategies that label the payoff matrices are pure strategies. In many games, however, the players
can also play mixed strategies, which are probability distributions over pure strategies. Poker is a good
example, where (good) players play according to a mixed strategy over the feasible actions of "bluffing" and
"no bluffing". We can assume that players possess a randomizing device which can be utilized in decision
situations. Playing a mixed strategy means that in each decision instance the player comes up with one of
her feasible actions with a certain pre-assigned probability. Each mixed strategy corresponds to a point $p$ of
the mixed strategy simplex

$\Delta_Q = \left\{ p = (p_1, \ldots, p_Q) \in \mathbb{R}^Q : \ p_q \ge 0, \ \sum_{q=1}^{Q} p_q = 1 \right\}$,    (2)

whose corners are the pure strategies. In the two-player case of Eq. (1) a strategy profile is the pair $(p, q)$
with $p \in \Delta_Q$, $q \in \Delta_R$, and the expected payoffs of Players 1 and 2 can be expressed as

$u_1(p, q) = p \cdot A q, \qquad u_2(p, q) = p \cdot B^T q = q \cdot B p$.    (3)

Note that in some games mixed strategies may not be allowed. 
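
As an illustration of the expected payoffs in Eq. (3), the following minimal Python sketch evaluates them for a given pair of mixed strategies. The function name `expected_payoffs`, the argument `B_table` (Player 2's payoffs as they appear in the table of Eq. (1), i.e., $B^T$ in the notation of the text) and the Matching Pennies payoff values are illustrative assumptions, not part of the original text.

```python
def expected_payoffs(A, B_table, p, q):
    """Expected payoffs (u1, u2) for mixed strategies p (Player 1) and q (Player 2).

    A[i][j] and B_table[i][j] are the payoffs of Player 1 and Player 2 for the
    pure profile (e_i, f_j); B_table plays the role of B^T in Eq. (3).
    """
    pairs = [(i, j) for i in range(len(p)) for j in range(len(q))]
    u1 = sum(p[i] * A[i][j] * q[j] for i, j in pairs)
    u2 = sum(p[i] * B_table[i][j] * q[j] for i, j in pairs)
    return u1, u2

# Matching Pennies (assumed payoffs): Player 1 wins on a match, Player 2 otherwise.
A = [[1, -1], [-1, 1]]
B_table = [[-1, 1], [1, -1]]
u1, u2 = expected_payoffs(A, B_table, [0.75, 0.25], [1.0, 0.0])
print(u1, u2)  # 0.5 -0.5
```

Pure strategies are recovered as the corners of the simplex, e.g. `p = [1.0, 0.0]`.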

The timing of a normal form game is as follows: (1) players independently, but simultaneously choose one 
of their feasible actions (i.e., without knowing the co-players' choices), (2) players receive payoffs according 
to the action profile realized. 

2.2. Nash equilibrium and social dilemmas 

Classical game theory is based on two key assumptions: (1) perfect rationality of the players, and (2) that 
this is common knowledge. Perfect rationality means that the players have well-defined payoff functions, 
and they are fully aware of their own and the opponents' strategy options and payoff values. They have no 
cognitive limitations in deducing the best possible way of playing whatever the complexity of the game is. 
In this sense computation is costless and instantaneous. Players are capable of correctly assessing missing 
information (in games with incomplete information) and process new information revealed by the play of the 
opponent (in dynamic games) in terms of probability distributions. Common knowledge, on the other hand, 
implies that beyond the fact that all players are rational, they all know that all players are rational, and
that all players know that all players know that all are rational, etc., ad infinitum (Fudenberg and Tirole,
1991; Gibbons, 1992).



A strategy $s_n$ of player $n$ is strictly dominated by a (pure or mixed) strategy $s'_n$ if, for each strategy profile
$s_{-n} = (s_1, \ldots, s_{n-1}, s_{n+1}, \ldots, s_N)$ of the co-players, player $n$ is always better off playing $s'_n$ than $s_n$,

$\forall s_{-n}: \quad u_n(s'_n, s_{-n}) > u_n(s_n, s_{-n})$.    (4)

According to the standard minimal definition of rationality (Aumann, 1992), rational players do not play
strictly dominated strategies. Thus strictly dominated strategies can be iteratively eliminated from the
problem. In some cases, like the Prisoner's Dilemma, only one strategy profile survives this procedure, which
is then the perfect rationality "solution" of the game. It is more common, however, that iterated elimination of
strictly dominated strategies does not solve the game, because either there are no strictly dominated strategies
at all, or more than one profile survives.
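
The elimination procedure can be sketched in a few lines of Python. This is a simplified illustration that only tests domination by pure strategies (the definition in Eq. (4) also allows domination by mixed strategies, which is not checked here). The function name and the payoff values of the two example games are assumptions made for the illustration.

```python
def iterated_elimination(A, B):
    """Iteratively remove pure strategies strictly dominated by another pure strategy.

    A[i][j], B[i][j]: payoffs of Players 1 and 2 for the pure profile (row i, column j).
    Returns the lists of surviving row and column indices.
    """
    rows = list(range(len(A)))
    cols = list(range(len(A[0])))
    changed = True
    while changed:
        changed = False
        # Player 1: drop rows that are strictly worse than some other surviving row.
        for r in list(rows):
            if any(all(A[r2][c] > A[r][c] for c in cols) for r2 in rows if r2 != r):
                rows.remove(r)
                changed = True
        # Player 2: drop columns that are strictly worse than some other surviving column.
        for c in list(cols):
            if any(all(B[r][c2] > B[r][c] for r in rows) for c2 in cols if c2 != c):
                cols.remove(c)
                changed = True
    return rows, cols

# Prisoner's Dilemma with strategies (C, D) and assumed payoffs R=3, S=0, T=5, P=1.
pd_A = [[3, 0], [5, 1]]
pd_B = [[3, 5], [0, 1]]
print(iterated_elimination(pd_A, pd_B))   # ([1], [1]): only mutual defection survives

# A coordination game (assumed payoffs): nothing is strictly dominated.
co = [[2, 0], [0, 1]]
print(iterated_elimination(co, co))       # ([0, 1], [0, 1])
```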

A stronger solution concept, which is applicable to all games, is the concept of a Nash equilibrium. A
strategy profile $s^* = (s_1^*, \ldots, s_N^*)$ of a game is said to be a Nash equilibrium (NE), iff

$\forall n, \ \forall s_n \neq s_n^*: \quad u_n(s_n^*, s_{-n}^*) \ge u_n(s_n, s_{-n}^*)$.    (5)

In other terms, each agent's strategy $s_n^*$ is a best response (BR) to the strategies of the co-players,

$\forall n: \quad s_n^* = \mathrm{BR}(s_{-n}^*)$,    (6)

where

$\mathrm{BR}(s_{-n}) = \arg\max_{s_n} u_n(s_n, s_{-n})$.    (7)

When the inequality above is strict, $s^*$ is called a strict Nash equilibrium. The NE condition assures that
no player has a unilateral incentive to deviate and play another strategy, because, given the others' choices,
there is no way she could be better off. One of the most fundamental results of classical game theory is
Nash's theorem (Nash, 1950), which asserts that in normal-form games with a finite number of players and a
finite number of pure strategies there exists at least one NE, possibly involving mixed strategies. The proof
is based on Kakutani's fixed point theorem. An NE is called a symmetric (Nash) equilibrium if all agents
play the same strategy in the equilibrium profile.
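
For small bi-matrix games the NE condition of Eq. (5) can be checked by brute force over all pure strategy profiles. The sketch below does this and also flags strict equilibria; it does not search for mixed equilibria. The function name and the example payoffs are illustrative assumptions.

```python
def pure_nash_equilibria(A, B):
    """All pure profiles (i, j) satisfying Eq. (5), each with a flag for strictness.

    A[i][j], B[i][j]: payoffs of Players 1 and 2 for the pure profile (row i, column j).
    """
    Q, R = len(A), len(A[0])
    result = []
    for i in range(Q):
        for j in range(R):
            # Payoff gains from unilateral deviations of Player 1 (rows) and Player 2 (columns).
            gains1 = [A[k][j] - A[i][j] for k in range(Q) if k != i]
            gains2 = [B[i][l] - B[i][j] for l in range(R) if l != j]
            if all(g <= 0 for g in gains1 + gains2):
                strict = all(g < 0 for g in gains1 + gains2)
                result.append(((i, j), strict))
    return result

pd_A, pd_B = [[3, 0], [5, 1]], [[3, 5], [0, 1]]      # Prisoner's Dilemma (assumed payoffs)
mp_A, mp_B = [[1, -1], [-1, 1]], [[-1, 1], [1, -1]]  # Matching Pennies (assumed payoffs)
print(pure_nash_equilibria(pd_A, pd_B))  # [((1, 1), True)]
print(pure_nash_equilibria(mp_A, mp_B))  # []: the only NE of this game is mixed
```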

The Nash equilibrium is a stability concept, but only in a rather restricted sense: it is stable against
single-agent (i.e., unilateral) changes of the equilibrium profile. It does not say what could happen if more
than one agent changed their strategies at the same time. In this latter case there are two possibilities, which
classify NEs into two categories. It may happen that there exists a suitable collective strategy change that
increases some players' payoffs while not decreasing those of any others. Clearly then the original NE was inefficient
[sometimes called a deficient equilibrium (Rapoport and Guyer, 1966)], and in theory can be improved upon by
the new strategy profile. In all other cases the NE is such that any collective strategy change makes at least
one player worse off (or, in the degenerate case, all payoffs remain the same). These NEs are called Pareto
efficient, and there is no obvious way for improvement.

Pareto efficiency can be used as an additional criterion (a so-called refinement) to the NE concept to
provide equilibrium selection in cases when the NE concept alone would provide more than one solution
to the game, as in some Coordination problems. For example, if a game has two NEs, of which the first is Pareto
efficient and the second is not, then we can say that the refined strategic equilibrium concept (in this case the
Nash equilibrium concept plus Pareto efficiency) predicts that the outcome of the game is the first NE.
Preferring Pareto-efficient equilibria to deficient equilibria then becomes an inherent part of the definition of
rationality.

It may well occur that the game has a single NE, which is, however, not Pareto efficient, and thus the social 
welfare (the sum of the individual utilities) is not maximized in equilibrium. Two archetypical examples are 
the Prisoner's Dilemma and the Tragedy of the Commons (see later in Sec. 2.5). Such situations are called 
social dilemmas, and their analysis, avoidance or possible resolution is one of the most fundamental issues 
of economics and social sciences. 

Another refinement concept which could serve as a guideline for equilibrium selection is risk dominance
(Harsanyi and Selten, 1988). A strategy $s'_n$ risk dominates another strategy $s_n$ if the former has a higher
expected payoff against an opponent playing all his feasible strategies with equal probability. Thus playing
strategy $s_n$ would carry a higher risk if the opponent were, for some reason, irrational, or if he were unable to
decide between several equally appealing NEs. The concept of risk dominance is the precursor to
stochastic stability, to be discussed in the sequel. Note that Pareto efficiency and risk dominance may well
give contradictory advice.
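
A small numerical illustration of how the two refinements can disagree, using the definition of risk dominance given above (expected payoff against a uniformly randomizing opponent). The Stag Hunt payoff values below are an assumption chosen for the example.

```python
def uniform_payoffs(A):
    """Expected payoff of each pure strategy against an opponent who plays
    all of his pure strategies with equal probability."""
    return [sum(row) / len(row) for row in A]

# Symmetric Stag Hunt with assumed payoffs; both (0, 0) and (1, 1) are strict NEs.
stag_hunt = [[4, 0],
             [3, 3]]

payoffs = uniform_payoffs(stag_hunt)                                  # [2.0, 3.0]
risk_dominant = max(range(2), key=lambda i: payoffs[i])               # strategy 1
pareto_preferred = max(range(2), key=lambda i: stag_hunt[i][i])       # strategy 0
print(risk_dominant, pareto_preferred)  # 1 0
```

Here the Pareto-efficient equilibrium is (0, 0) with payoff 4 for both players, whereas strategy 1 is the safer choice against an unpredictable opponent.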

2.3. Potential and zero-sum games 

In general the number of utility functions to consider in a game equals the number of players. However,
there are two special classes where a single function is enough to characterize the strategic incentives of
all players: potential games and zero-sum games. The existence of these single functions makes the analysis
more transparent and, as we will see, makes the methods of statistical physics directly applicable.



By definition (Monderer and Shapley, 1996), if there exists a function $V = V(s_1, s_2, \ldots, s_N)$ such that for
each player $n = 1, \ldots, N$ the utility function differences satisfy

$u_n(s'_n; s_{-n}) - u_n(s_n; s_{-n}) = V(s'_n; s_{-n}) - V(s_n; s_{-n})$,    (8)

then the game is called a potential game with $V$ being its potential. If the potential exists it can be thought of
as a single, fixed landscape, common to all players, in which they try to reach the maximum. In the case of the
two-player game in Eq. (1) $V$ is a $Q \times R$ matrix, which represents a two-dimensional discretized landscape.
The concept can be trivially extended to more players, in which case the dimension of the landscape is the
number of players. For games with continuous strategy space (e.g., mixed strategies) the finite differences
in Eq. (8) can be replaced by partial derivatives and the landscape is continuous.

For potential games the existence of a Nash equilibrium, even in pure strategies, is trivial if the strategy 
space (the landscape) is compact: the global maximum of V, which then necessarily exists, is a pure strategy 
NE. 

For a two-player game, $G = (A, B^T)$, it is easy to formulate a simple sufficient condition for the existence
of a potential. If the payoffs are equal for the two players in all strategy configurations, i.e., $A = B^T = V$,
then $V$ is a potential, as can be checked directly. These games are also called "partnership games" or
"games with common interests" (Monderer and Shapley, 1996; Hofbauer and Sigmund, 1998). If the game
is symmetric, i.e., $A = B$, this condition implies that the potential is a symmetric matrix, $V^T = V$.

There is, however, an even wider class of games for which a potential can be defined. Indeed, notice that
Nash equilibria are invariant under certain rescalings of the game. Specifically, the game $G' = (A', B'^T)$ is
said to be Nash-equivalent to $G = (A, B^T)$, denoted $G \sim G'$, if there exist constants $\alpha, \beta > 0$ and $c_r$,
$d_q$ such that

$A'_{qr} = \alpha A_{qr} + c_r, \qquad B'_{rq} = \beta B_{rq} + d_q$.    (9)

According to this, the payoffs can be freely multiplied by positive constants, and arbitrary constants can be added to the columns
of Player 1's payoff and the rows of Player 2's payoff: the Nash equilibria of $G$ and $G'$ are the same. If
there exists a $Q \times R$ matrix $V$ such that $(A, B^T) \sim (V, V)$, the game is called a rescaled potential game
(Hofbauer and Sigmund, 1998).



The other class of games with a single strategy landscape is (rescaled) zero-sum games, $(A, B^T) \sim (V, -V)$.
Zero-sum games are only defined for two players. Whereas for potential games the players try to
maximize $V$ along their associated dimensions, for zero-sum games Player 1 is a maximizer and Player 2
is a minimizer of $V$ along their respective strategy spaces. The existence of a Nash equilibrium in mixed
strategies $(p, q) \in \Delta_Q \times \Delta_R$ follows from Nash's theorem, but in fact it was proved earlier by von Neumann.
Von Neumann's Minimax Theorem (von Neumann, 1928) asserts that each zero-sum game can be associated
with a value $v$, the optimal outcome. The payoffs $v$ and $-v$, respectively, are the best the two players
can achieve in the game if both are rational. Denoting Player 1's expected payoff by $u(p, q) = p \cdot V q$, the
value $v$ satisfies

$v = \max_p \min_q u(p, q) = \min_q \max_p u(p, q)$,    (10)

i.e., the two extremization steps can be exchanged. The minimax theorem also provides a straightforward
algorithm for solving zero-sum games (the minimax algorithm), with direct extension (at least in theory) to
dynamic zero-sum games such as Chess or Go.
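
The equality in Eq. (10) can be verified directly for $2 \times 2$ zero-sum games. The sketch below is restricted to this case: for a fixed $p$ the inner minimum is attained at a pure strategy of the opponent, so the outer maximum of the resulting piecewise linear function lies either at $p = 0$, $p = 1$, or where the two lines cross. The function names and example matrices are assumptions made for the illustration, and this is not a general-purpose solver.

```python
from fractions import Fraction

def maximin(V):
    """max over p of min over columns j of p*V[0][j] + (1-p)*V[1][j] (2x2 case only)."""
    candidates = [Fraction(0), Fraction(1)]
    denom = (V[0][0] - V[1][0]) - (V[0][1] - V[1][1])
    if denom != 0:
        p = Fraction(V[1][1] - V[1][0], denom)   # both columns give the same payoff here
        if 0 <= p <= 1:
            candidates.append(p)
    return max(min(p * V[0][j] + (1 - p) * V[1][j] for j in range(2))
               for p in candidates)

def minimax(V):
    """min over q of max over rows: obtained as minus the maximin of Player 2's game."""
    W = [[-V[0][0], -V[1][0]],
         [-V[0][1], -V[1][1]]]                   # W = -V^T, Player 2's payoff matrix
    return -maximin(W)

matching_pennies = [[1, -1], [-1, 1]]
other_game = [[3, 1], [0, 2]]
print(maximin(matching_pennies), minimax(matching_pennies))  # 0 0
print(maximin(other_game), minimax(other_game))              # 3/2 3/2
```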

2.4. NEs in two-player matrix games 

A general two-player, two-strategy symmetric game is defined by the matrix

$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$,    (11)

where $a, b, c, d$ are real parameters. Such a game is Nash-equivalent to the rescaled game

$A' = \begin{pmatrix} \cos\phi & 0 \\ 0 & \sin\phi \end{pmatrix}, \qquad \tan\phi = \frac{d - b}{a - c}$.    (12)

Let $p_1$ characterize Player 1's mixed strategy $p = (p_1, 1 - p_1)$, and $q_1$ Player 2's mixed strategy $q = (q_1, 1 - q_1)$.
By introducing the notation

$r = \frac{d - b}{a - b - c + d} = \frac{1}{1 + \cot\phi}$,    (13)

the following classes can be distinguished as a function of the single parameter $\phi$:

Coordination Class [$0 < \phi < \pi/2$, i.e., $a - c > 0$, $d - b > 0$]. There are two strict, pure strategy NEs, and
one non-strict, mixed strategy NE:

$\mathrm{NE}_1: \ p_1^* = q_1^* = 1$;
$\mathrm{NE}_2: \ p_1^* = q_1^* = 0$;    (14)
$\mathrm{NE}_3: \ p_1^* = q_1^* = r$.

All the NEs are symmetric. The prototypical example is the Coordination game (see Appendix A.3 for
details).

Anti-Coordination Class [$\pi < \phi < 3\pi/2$, i.e., $a - c < 0$, $d - b < 0$]. There are two strict, pure strategy
NEs, and one non-strict, mixed strategy NE:

$\mathrm{NE}_1: \ p_1^* = 1, \ q_1^* = 0$;
$\mathrm{NE}_2: \ p_1^* = 0, \ q_1^* = 1$;    (15)
$\mathrm{NE}_3: \ p_1^* = q_1^* = r$.

$\mathrm{NE}_1$ and $\mathrm{NE}_2$ are asymmetric, $\mathrm{NE}_3$ is a symmetric equilibrium. The most important games belonging to
this class are the Hawk-Dove, the Chicken, and the Snowdrift games (see Appendix A).

Pure Dominance Class [$\pi/2 < \phi < \pi$ or $3\pi/2 < \phi < 2\pi$, i.e., $(a - c)(d - b) < 0$]. One of the pure
strategies is strictly dominated. There is only one Nash equilibrium, which is pure, strict, and symmetric:

$\mathrm{NE}_1: \ p_1^* = q_1^* = \begin{cases} 0 & \text{if } \pi/2 < \phi < \pi, \\ 1 & \text{if } 3\pi/2 < \phi < 2\pi. \end{cases}$    (16)

The best example is the Prisoner's Dilemma (see Appendix A.7 and later sections).

On the borderline between the different classes games are degenerate and a new equilibrium appears or 
disappears. At these points the appearing/disappearing NE is always non-strict. Figure 1 depicts the phase 
diagram. 

The classification above is based on the number and type of Nash equilibria. Note that this is only a
minimal classification, since these classes can be further divided using other properties. For instance, the
Nash-equivalency relation does not keep Pareto efficiency invariant. The latter is a separate issue, which
does not depend directly on the combined parameter $\phi$. In the Coordination Class, unless $a = d$, only one of
the three NEs is Pareto efficient: if $a > d$ it is $\mathrm{NE}_1$, if $a < d$ it is $\mathrm{NE}_2$. $\mathrm{NE}_3$ is never Pareto efficient. In the
Anti-Coordination Class $\mathrm{NE}_1$ and $\mathrm{NE}_2$ are Pareto efficient, $\mathrm{NE}_3$ is not. The only NE of the Pure Dominance
Class is Pareto efficient when $d > a$ for $\pi/2 < \phi < \pi$, and when $d < a$ for $3\pi/2 < \phi < 2\pi$. Otherwise the
NE is deficient (social dilemma).
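
The classification by the angle $\phi$ can be sketched as a short Python function. It uses the signs of $a - c$ and $d - b$ (proportional to $\cos\phi$ and $\sin\phi$ in Eq. (12)) and returns the mixed-strategy weight $r$ of Eq. (13) where a mixed NE exists. The function name and the payoff values of the three example games are assumptions for the illustration.

```python
import math

def classify(a, b, c, d):
    """Class, angle phi in [0, 2*pi), and r (or None) for the game [[a, b], [c, d]]."""
    x, y = a - c, d - b                      # proportional to cos(phi) and sin(phi)
    phi = math.atan2(y, x) % (2 * math.pi)
    if x > 0 and y > 0:
        return "coordination", phi, y / (x + y)
    if x < 0 and y < 0:
        return "anti-coordination", phi, y / (x + y)
    if x * y < 0:
        return "pure dominance", phi, None
    return "degenerate (class boundary)", phi, None

print(classify(3, 0, 5, 1)[0])    # Prisoner's Dilemma (C, D): pure dominance
print(classify(-1, 2, 0, 1)[0])   # Hawk-Dove (H, D) with V=2, C=4: anti-coordination
print(classify(4, 0, 3, 2)[0])    # Stag Hunt: coordination
```

For the Hawk-Dove example the returned value of $r$ is 0.5, i.e., Hawk is played with probability $V/C$ in the mixed equilibrium.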

What can we say when the strategy space contains more than two pure strategies, $Q > 2$? It is hard to
give a complete classification as the number of the possible classes increases exponentially with $Q$ (Broom,
2000). A classification based on the replicator dynamics (see later) is available for $Q = 3$ (Bomze, 1983,
1995; Zeeman, 1980).



What we surely know is that for any finite game which allows mixed strategies, Nash's theorem (Nash,
1950) assures that there is at least one Nash equilibrium, possibly in mixed strategies. Nash's theorem is
only an existence theorem, and in fact the typical number of Nash equilibria in matrix games increases
rapidly as the number of pure strategies $Q$ increases.

Fig. 1. Nash equilibria in $2 \times 2$ matrix games $A'$, defined in Eq. (12), classified by the angle $\phi$. Filled boxes denote strict NE,
empty boxes non-strict NE in the strategy space $p_1$ vs $q_1$. The arrow shows the direction of motion of the mixed strategy NE for
increasing $\phi$.



It is customary to distinguish interior and boundary NEs with respect to the strategy simplex. An interior
NE $p^*_{\rm int} \in \mathrm{int}\,\Delta_Q$ is a mixed strategy equilibrium with $0 < p_i^* < 1$ for all $i = 1, \ldots, Q$. A boundary NE
$p^* \in \mathrm{bd}\,\Delta_Q$ is a mixed strategy equilibrium in which at least one of the pure strategies has zero weight, i.e.,
$\exists i$ such that $p_i^* = 0$. These solutions are situated on the $(Q-2)$-dimensional boundary $\mathrm{bd}\,\Delta_Q$ of the $(Q-1)$-dimensional
simplex $\Delta_Q$. Pure strategy NEs are sometimes called "corner solutions". As a special case of
Nash's theorem it can be shown (Cheng et al., 2004) that a finite symmetric game always possesses at least
one symmetric equilibrium, possibly in mixed strategies.

Whereas for Q = 2 we always have a pure strategy equilibrium, this is no longer true for Q > 2. A typical 
example is the Rock-Scissors-Paper game whose only NE is mixed. 

When the payoff matrix is regular, i.e., $\det A \neq 0$, there can be at most one interior NE. An interior
NE in a matrix game is necessarily symmetric. If it exists, it necessarily satisfies (Bishop and Cannings,
1978)

$p^*_{\rm int} = \frac{1}{\mathcal{N}} A^{-1} \mathbf{1}$,    (17)

where $\mathbf{1}$ is the $Q$-dimensional vector with all elements equal to 1, and $\mathcal{N}$ is a normalization factor to assure
$\sum_i [p^*_{\rm int}]_i = 1$. The possible location of the boundary NEs, if they exist, can be calculated similarly by
restricting (projecting) $A$ to the appropriate boundary manifold under consideration.
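
Equation (17) amounts to solving the linear system $A x = \mathbf{1}$ and normalizing the solution. The following sketch does this with exact rational arithmetic and plain Gaussian elimination, returning `None` when $A$ is singular or when the solution does not lie in the interior of the simplex. The function name and the example matrices (a Rock-Scissors-Paper variant with unequal win and loss payoffs, chosen so that $\det A \neq 0$) are assumptions for the illustration.

```python
from fractions import Fraction

def interior_ne(A):
    """Interior NE candidate of Eq. (17), or None if it does not exist."""
    n = len(A)
    # Augmented matrix [A | 1], reduced by Gauss-Jordan elimination.
    M = [[Fraction(v) for v in row] + [Fraction(1)] for row in A]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                       # det A = 0: Eq. (17) is not applicable
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    x = [M[r][n] for r in range(n)]
    norm = sum(x)                             # the normalization factor
    if norm == 0:
        return None
    p = [v / norm for v in x]
    return p if all(v > 0 for v in p) else None

rsp_variant = [[0, 2, -1], [-1, 0, 2], [2, -1, 0]]   # assumed payoffs: win 2, lose -1
print(interior_ne(rsp_variant))                      # three equal weights of 1/3
print(interior_ne([[0, 1, -1], [-1, 0, 1], [1, -1, 0]]))  # None: this matrix is singular
```

The standard zero-sum Rock-Scissors-Paper matrix is antisymmetric and hence singular for $Q = 3$, which is why the formula cannot be applied to it directly.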

When the payoff matrix is singular, $\det A = 0$, the game may have an extended set of NEs, a so-called NE
component. An example of this will be given later in Section 3.2.

In the case of asymmetric games (bi-matrix games) the two players have different strategy sets and different
payoffs. An asymmetric Nash equilibrium is now a pair of strategies. Each component of such a pair is
a best response to the other component. The classification of bi-matrix games is more complicated even
for two possible strategies, $Q = R = 2$. The standard reference is Rapoport and Guyer (1966), who define
three major classes based on the number of players having a dominant strategy, and then define altogether
fourteen different sub-classes within. Note, however, that they only classify bi-matrices with ordinal payoffs,
i.e., when the four elements of the $2 \times 2$ matrix are the numbers 1, 2, 3 and 4 (the rank of the utility).

A different taxonomy based on the replicator dynamics phase portraits is given by Cressman (2003).

2.5. Multi-player games



There are many socially and economically important examples where the number of decision makers
involved is greater than two. Although sometimes these situations can be modeled as repeated play of
simple pair interactions, there are many cases where the most fundamental unit of the game is irreducibly
of multi-player nature. These games cannot be cast in a matrix or bi-matrix form. Still the basic solution
concept is the same: when played by rational agents the outcome should be a Nash equilibrium where no
player has an incentive to deviate unilaterally.

We briefly mention one example here: the Tragedy of the Commons (Gibbons, 1992). This abstract
game exemplifies what has been one of the major concerns of political philosophy and economic thinking
since at least David Hume in the 18th century (Hardin, 1968): without central planning and global control,
private incentives would lead to overutilization of public resources and insufficient contribution to public
goods. Clearly, models of this kind provide valuable insight into deep socioeconomic problems like pollution,
deforestation, mining, fishing, climate control, or environment protection, just to mention a few.

The Tragedy of the Commons is a simultaneous move game with $N$ players (farmers). The strategy for
farmer $n$ is to choose a non-negative number $g_n \in [0, \infty)$, the number of goats, assumed to be a real number
for simplicity, to graze on the village green. One goat implies a constant cost $c$, and a benefit $v(G)$ to the
farmer, which is a function of the total number of goats $G = g_1 + \ldots + g_N$ grazing on the green. However,
grass is a scarce resource. If there are many goats on the green the amount of available grass decreases.
Goats become undernourished and their value decreases. We assume that $v'(G) < 0$ and $v''(G) < 0$. The
farmer's payoff is

$u_n = [v(G) - c]\, g_n$.    (18)

Rational game theory predicts that the emerging collective behavior is a Nash equilibrium with choices
$(g_1^*, \ldots, g_N^*)$ from which no farmer has a unilateral incentive to deviate. The first order condition for the
coupled maximization problem, $\partial u_n / \partial g_n = 0$, leads to

$v(g_n + G^*_{-n}) + g_n v'(g_n + G^*_{-n}) - c = 0$,    (19)

where $G_{-n} = g_1 + \ldots + g_{n-1} + g_{n+1} + \ldots + g_N$. Summing over the first order conditions for all players
and dividing by $N$, we get an equation for the equilibrium number of total goats, $G^*$,

$v(G^*) + \frac{1}{N} G^* v'(G^*) - c = 0$.    (20)

Given the function $v$, this can be solved (numerically) to obtain $G^*$.

Note, however, that the social welfare would be maximized by another value $G^{**}$, which maximizes the
total profit $G v(G) - cG$ with respect to $G$, i.e., satisfies the first order condition (note the missing $1/N$ factor)

$v(G^{**}) + G^{**} v'(G^{**}) - c = 0$.    (21)

Comparing Eqs. (20) and (21), we find that $G^* > G^{**}$, that is, too many goats are grazed in the Nash
equilibrium compared to the social optimum. The resource is overutilized. The NE is not Pareto efficient
(it could be easily improved by a central planner or a law fixing the total number of goats at $G^{**}$), which
makes the problem a social dilemma.

The same kind of inefficiency, and the possible ways to overcome it, are studied actively by economists in
Public Good game experiments (Ledyard, 1995). See Appendix A.8 for more details on Public Good games.
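
To make the comparison of Eqs. (20) and (21) concrete, the sketch below solves both first order conditions numerically by bisection. The value function $v(G) = v_0 - G^2$ and all parameter values are hypothetical choices made only for this illustration (they satisfy $v' < 0$ for $G > 0$ and $v'' < 0$); for this choice the roots are also known in closed form, $G^* = \sqrt{(v_0 - c)/(1 + 2/N)}$ and $G^{**} = \sqrt{(v_0 - c)/3}$.

```python
import math

def bisect(f, lo, hi, tol=1e-12):
    """Root of a continuous, decreasing function f on [lo, hi] with f(lo) > 0 > f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

V0, COST, N = 100.0, 10.0, 5       # hypothetical parameters

def v(G):
    return V0 - G * G              # assumed value of a single goat

def dv(G):
    return -2.0 * G                # derivative v'(G)

# Nash equilibrium, Eq. (20), and social optimum, Eq. (21).
G_nash = bisect(lambda G: v(G) + G * dv(G) / N - COST, 0.0, 10.0)
G_social = bisect(lambda G: v(G) + G * dv(G) - COST, 0.0, 10.0)
print(G_nash > G_social)           # True: the green is overgrazed in equilibrium
```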



2.6. Repeated games 



Most of the inter-agent interactions that can be modeled by abstract games are not one-shot relationships
but occur repeatedly on a regular basis. When a one-shot game is played between the same rational players
iteratively, a single instance of this series cannot be singled out and treated separately. The whole series
should be analyzed as one big "supergame". What a player does early on can affect what others choose to
do later on.

Assume that the same game $G$ (the so-called stage game) is played a number of times $T$. $G$ can be the
Prisoner's Dilemma or any matrix or bi-matrix game discussed so far. The set of feasible actions and payoffs
in the stage game at period $t$ ($t = 1, \ldots, T$) are independent of $t$ and of the former history of the game. This
does not mean, however, that actions themselves should be chosen independently of time and history. When
$G$ is played at period $t$, all the game history thus far is common knowledge. We will denote this repeated
game as $G(T)$ and distinguish finitely repeated games, $T < \infty$, and infinitely repeated games, $T = \infty$. For
finitely repeated games the total payoff is simply the sum of the stage game payoffs,

$U = \sum_{t=1}^{T} u_t$.    (22)



For infinitely repeated games discounting should be introduced to avoid infinite utilities. Discounting is a
regularization method, which is, however, not a pure mathematical trick but reflects real economic factors.
On the one hand, discounting takes into account the probability $p$ that the repetition of the game may terminate
at any period. On the other hand, it represents the fact that the current value of a future income is less
than its nominal value. If the interest rate of the market for one period is denoted $r$, the overall discount
factor, representing both effects, reads $\delta = (1 - p)/(1 + r)$ (Gibbons, 1992). Using this, the present value of
the infinite series of payoffs $u_1, u_2, \ldots$ is⁶

$U = \sum_{t=1}^{\infty} \delta^{t-1} u_t$.    (23)

In the following we will denote a finitely repeated game based on $G$ as $G(T)$ and an infinitely repeated
game as $G(\infty, \delta)$, $0 < \delta < 1$, and consider $U$ in Eq. (22), resp. Eq. (23), as the payoff of the repeated game
(supergame).
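
A brief numerical sketch of the discount factor and of the present value in Eq. (23), truncated to a finite number of periods. The function names and the numbers used are illustrative assumptions; for a constant payoff stream $u_t = u$ the infinite sum equals $u/(1 - \delta)$, which the truncated sum approaches.

```python
def discount_factor(p_stop, r):
    """delta = (1 - p) / (1 + r): p_stop is the per-period termination probability,
    r the one-period interest rate."""
    return (1.0 - p_stop) / (1.0 + r)

def present_value(payoffs, delta):
    """Sum of delta**(t-1) * u_t over the listed payoffs u_1, u_2, ..."""
    return sum(u * delta ** t for t, u in enumerate(payoffs))

print(present_value([1, 1, 1], 0.5))        # 1.75
print(present_value([3] * 200, 0.5))        # close to 3 / (1 - 0.5) = 6
```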



Strategies as algorithms 

In one-shot static games with complete information like those we have considered in earlier subsections a 
strategy is simply an action a player can choose. For repeated games (and also for other kinds of dynamic 
games) the concept of a strategy becomes more complex. In these games a player's strategy is a complete 
plan of action, specifying a feasible action in any contingency in which the player may be called upon to act. 
The number of possible contingencies in a repeated game is the number of possible histories the game could 
have produced thus far. Thus the strategy is in fact a mathematical algorithm which determines the output 
(the action to take in period t) as a function of the input (the actual time t and the history of the actions 
of all players up to period $t - 1$). In case of perfect rationality there is no limit on this algorithm. However,
bounded rationality may restrict feasible strategies to those that do not require too long memories or too 
complex algorithms. 

To illustrate how fast the number of feasible strategies for $G(T)$ increases, consider a symmetric, $N$-player
stage game $G$ with action space $S = \{e_1, e_2, \ldots, e_Q\}$. Let $M$ denote the length of the memory required by
the strategy algorithm: $M = 1$ if the algorithm only needs to know the actions of the players in the last
round, $M = 2$ if knowledge of the last two rounds is required, etc. The maximum possible memory length
at period $T$ is $T - 1$. Given the player's memory, a strategy is a function that maps each of the $Q^{NM}$ possible
memory contents to one of the $Q$ actions. There are altogether $Q^{Q^{NM}}$ length-$M$ strategies. Notice, however,
that beyond depending on the game history a strategy may
also depend on the time $t$ itself. For instance, actions may depend on whether the stage game is played on a
weekday or on the weekend. In addition to the beginning of the game, in finitely repeated games the end of
the game can also get special treatment, and strategies can depend on the remaining time $T - t$. As will
be shown shortly, this possibility can have a crucial effect on the outcome of the game.
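
The growth of the strategy count $Q^{Q^{NM}}$ is easy to tabulate. The helper names below are hypothetical; the example uses the two-player, two-action case of the Prisoner's Dilemma.

```python
def n_memory_states(Q, N, M):
    """Number of distinct histories of length M: Q actions, N players, M rounds."""
    return Q ** (N * M)

def n_strategies(Q, N, M):
    """Number of deterministic length-M strategies: one action for each memory state."""
    return Q ** n_memory_states(Q, N, M)

for M in range(4):
    print(M, n_strategies(2, 2, M))
# 0 2
# 1 16
# 2 65536
# 3 18446744073709551616
```

Already for memory length $M = 3$ the two-player, two-action game has $2^{64}$ deterministic strategies.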



6 Although it seems reasonable to introduce a similar discounting for finitely repeated games, too, that would have no qualitative
effect there.

In the case of the Iterated Prisoner's Dilemma some standard strategies like "AllC" ($M = 0$),
"Tit-for-Tat" (TFT, $M = 1$) or "Pavlov" ($M = 1$) are finite-memory strategies; others, like "Contrite TFT"
(Boerlijst et al., 1997; Panchanathan and Boyd, 2004), depend on the whole action history as input. However,
the need for such a long-term memory can sometimes be traded off for some additional state variables
("image score", "standing", etc.) (Panchanathan and Boyd, 2004; Nowak and Sigmund, 2005). For instance,
these variables can code for the "reputation" of agents, which can be utilized in decision making, especially
when the game is such that opponents are chosen randomly from the population in each round. Of course,
then a strategy should involve a rule for updating these state variables too. Sometimes these strategies
can be conveniently expressed as Finite State Machines (Finite State Automata) (Binmore and Samuelson,
1992; Lindgren, 1997).

The definition of a strategy should also prescribe how to handle errors (noise) if these have a finite probability
of occurring. It is customary to distinguish two kinds of errors: implementation ("trembling hand")
errors and perception errors. The first refers to unintended "wrong" actions like playing D accidentally in 
the Prisoner's Dilemma when C was intended. These errors usually become common knowledge for players. 
Perception errors, on the other hand, arise from events when the action was correct as intended, but one 
of the players (usually the opponent) interpreted it as another action. In this case players end up keeping a 
different track record of the game history. 
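
The view of strategies as algorithms can be illustrated with memory-one strategies in the Iterated Prisoner's Dilemma. In the sketch below each strategy is a function of the player's own and the opponent's last action (`None` in the first round); errors are not modeled. The payoff values ($T = 5$, $R = 3$, $P = 1$, $S = 0$), the function names, and the rule used for "Pavlov" (win-stay, lose-shift: cooperate exactly when both players chose the same action in the previous round) are assumptions made for the illustration.

```python
# Stage game payoffs (own, opponent's) for each action pair; assumed values.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def all_d(my_last, other_last):
    return "D"

def tit_for_tat(my_last, other_last):
    # Cooperate in the first round, then copy the opponent's previous action.
    return "C" if other_last is None else other_last

def pavlov(my_last, other_last):
    # Cooperate in the first round and whenever both actions coincided last round.
    return "C" if my_last is None or my_last == other_last else "D"

def play(strategy1, strategy2, rounds):
    """Total (undiscounted) payoffs of the two strategies, as in Eq. (22)."""
    last1 = last2 = None
    total1 = total2 = 0
    for _ in range(rounds):
        a1 = strategy1(last1, last2)
        a2 = strategy2(last2, last1)
        u1, u2 = PAYOFF[(a1, a2)]
        total1, total2 = total1 + u1, total2 + u2
        last1, last2 = a1, a2
    return total1, total2

print(play(tit_for_tat, tit_for_tat, 5))  # (15, 15)
print(play(tit_for_tat, all_d, 5))        # (4, 9)
print(play(pavlov, all_d, 5))             # (2, 17)
```

Tit-for-Tat is exploited only once by an unconditional defector, whereas Pavlov is exploited in every second round.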



Subgame perfect Nash equilibrium 

What is the prediction of rational game theory for the outcome of a repeated game? As always the outcome
should correspond to a Nash equilibrium: no player can have a unilateral incentive to change its strategy (in
the supergame sense), since this would induce immediate deviation from that profile. However, not all NEs
are equally plausible outcomes in a dynamic game such as a repeated game. There are ones which are based
on "non-credible threats and promises" (Fudenberg and Tirole, 1991; Gibbons, 1992). A stronger concept
than Nash equilibrium is needed to exclude these spurious NEs. Subgame perfection, introduced by Selten
(1965), is a widely accepted criterion to solve this problem. Subgame perfect Nash equilibria are those that
pass a credibility test.

A subgame of a repeated game is a subseries of the whole series that starts at period $t > 1$ and ends at the
last period $T$. However, there are many subgames starting at $t$, one for each possible history of the (whole)
game before $t$. Thus subgames are labeled by the starting period $t$ and the history of the game before $t$,
and when a subgame is reached in the game the players know the history of play. By definition an NE of
the game is subgame perfect if it is an NE in all subgames (Selten, 1965).

One of the key results of rational game theory is that there is no cooperation in the Finitely Repeated
Prisoner's Dilemma. In fact the theorem is more general and states that if a stage game $G$ has a single NE
then the unique subgame perfect NE of $G(T)$ is the one in which the NE of $G$ is played in every period. The
proof is by backward induction. In the last round of a Prisoner's Dilemma rational players defect, since the
incentive structure of the game is the same as that of $G$. There is no room for expecting future rewards for
cooperation or punishment for defection. Expecting defection in the last period they consider the next-to-last
period, and find that the effective payoff matrix in terms of the remaining undecided actions of period $T - 1$
is Nash-equivalent to that of $G$ (the expected payoffs from the last round constitute a constant shift for
this payoff matrix). The incentive structure is again the same as that of $G$ and they defect. Considering the
next-to-next-to-last round gives a similar result, etc. The induction process can be continued backward till
the first period, showing that rational players defect in all rounds.



The backward induction paradox 

There is substantial evidence from experimental economics showing that human players do cooperate
in repeated interactions, especially in early periods when the end of the game is still far away. Moreover,
even in situations which are seemingly best described as single-shot Prisoner's Dilemmas cooperation is not
infrequent. The difficulty rational game theory has in coping with these facts is sometimes referred to as the
backward induction paradox (Pettit and Sugden, 1989). Similarly, contrary to rational game theory predictions,
altruistic behavior is common in Public Good experiments, and "fair behavior" appears frequently in the
Dictator game or in the Ultimatum game (Forsythe et al., 1994; Camerer and Thaler, 1995).

Game theory has worked out a number of possible answers to these criticisms: players are rational but 
the actual game is not what it seems to be; players are not fully rational, or rationality is not common 
knowledge; etc. As for the first, it seems at least possible that the evolutionary advantages of cooperative 
behavior have exerted a selective pressure to make humans hardwired for cooperation (Fehr and Fischbacher, 
2003). This would imply a genetic predisposition for the player's utility function to take into account the 
well-being of the co-player too. The pure "monetary" payoff is not the ultimate utility to consider in the 
analysis. Social psychology distinguishes various player types from altruistic to cooperative to competitive 
individuals. In the simplest setup the player's overall utility function is $U_n = \alpha u_n + \beta u_m$, i.e., a linear 
combination of his own monetary payoff $u_n$ and the opponent's monetary payoff $u_m$, where the signs of the 
coefficients $\alpha$ and $\beta$ depend on the player's type (Brosig, 2002; Weibull, 2004). Clearly such a dependence 
may redefine the incentive structure of the game, and may transform a would-be social dilemma setup into 
another game, where cooperation is, in fact, rational. Also in many cases the experiment, which is intended 
to be a one-shot or a finitely repeated game is in fact perceived by the players - despite all contrary efforts 
by the experimenter - as being part of a larger, practically infinite game (say, the players' everyday life in 
its full complexity) where other issues like future threats or promises, or reputation matter, and strategies 
observed in the experiment are in fact rational in this supergame sense. 

Infinitely repeated games can behave differently from what is predicted for finitely repeated games. 
Indeed, cooperation can be rational in infinitely repeated games. A collection of theorems 
formalizing this result is usually referred to as the Folk Theorem.[7] General results by Friedman (1971) 
and later by Fudenberg and Maskin (1986) imply that if the discount factor is sufficiently close to 1, i.e., if 
players are sufficiently patient, Pareto efficient subgame perfect Nash equilibria (among many other possible 
equilibria) can be reached in an infinitely repeated game $G(\infty, \delta)$. In case of the Prisoner's Dilemma, for 
instance, this means that cooperation can be rational if the game is infinite and $\delta$ is sufficiently large. 

When restricted to the Prisoner's Dilemma the proof is based on considering unforgiving ("trigger") 
strategies like "Grim Trigger". Grim Trigger starts by cooperating and cooperates until the opponent defects. 
From that point on it always defects. The strategy profile of both players playing Grim Trigger is a subgame 
perfect NE provided that $\delta$ is larger than some well defined lower bound. See, e.g., Gibbons' book (Gibbons, 
1992) for a readable account. Clearly, cooperation in this case stems from the fear of an infinitely long 
punishment phase following defection, and the temptation to defect in a given period is suppressed by the 
threatening cumulative loss from this punishment. When the discount factor is small the threat diminishes, 
and the cooperative behavior evaporates. 
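The lower bound on the discount factor can be made explicit with a short calculation. The sketch below is only illustrative and rests on assumptions not stated in this section: the Prisoner's Dilemma payoffs are denoted by the conventional letters T (temptation), R (reward for mutual cooperation) and P (punishment for mutual defection) with T > R > P, and future payoffs are discounted geometrically by $\delta$ per round. Under these assumptions cooperating forever yields $R/(1-\delta)$, while a single deviation yields $T + \delta P/(1-\delta)$, so Grim Trigger deters defection iff $\delta \ge (T-R)/(T-P)$.

```python
def grim_trigger_threshold(T, R, P):
    """Smallest discount factor that deters a one-shot deviation from Grim Trigger.

    T, R, P are assumed (conventional) Prisoner's Dilemma payoffs with T > R > P.
    """
    return (T - R) / (T - P)


def cooperation_is_sustainable(T, R, P, delta):
    """Compare the discounted payoff of cooperating forever with defecting once."""
    cooperate_forever = R / (1.0 - delta)
    # Defect now (payoff T), then face mutual defection (payoff P) in every later round.
    defect_once = T + delta * P / (1.0 - delta)
    return cooperate_forever >= defect_once


# Hypothetical payoff values chosen only for illustration.
threshold = grim_trigger_threshold(T=5.0, R=3.0, P=1.0)
patient = cooperation_is_sustainable(5.0, 3.0, 1.0, delta=0.6)
impatient = cooperation_is_sustainable(5.0, 3.0, 1.0, delta=0.4)
```

With these illustrative numbers the threshold is 1/2: players with $\delta = 0.6$ sustain cooperation, players with $\delta = 0.4$ do not.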

Another interesting way to explain cooperation in finitely repeated social dilemmas is to assume that play- 
ers are not fully rational or that their rationality is not common knowledge. Since cooperation of boundedly 
rational agents is the major theme of subsequent sections, we only discuss here the second possibility, that 
is when all players are rational but this is not common knowledge. Each player knows that she is rational, 
but perceives a chance that the co-players are not. This information asymmetry makes the game a different 
game, namely a finitely repeated game based on a stage game with incomplete information. As was shown 
in a seminal paper by Kreps et al. (1982), such an information asymmetry is able to rescue rational game 
theory, and assure cooperation even in the finitely repeated Prisoner's Dilemma, provided that the game 
is long enough or the chance of playing with a non-rational co-player is high enough, and the stage game 
payoffs satisfy some simple inequalities. The crux of the proof is that there is an incentive for rational players 
to mimic non-rational players and thus create a reputation (of being non-rational, i.e., cooperative) for which 
the other's best reply is cooperation - at least sufficiently far from the last stage. 



7 The name "Folk Theorem" refers to the fact that some of these results were common knowledge already in the 1950s, even 
though no one had published them. Although later more specific theorems were proven and published, the name remained. 
3. Evolutionary games: population dynamics 

Evolutionary game theory is the theory of dynamic adaptation and learning in (infinitely) repeated games 
played by boundedly rational agents. Although nowadays evolutionary game theory is understood as an 
intrinsically dynamic theory, it originally started in the 1970s as a novel static refinement concept for Nash 
equilibria. Keeping the historical order, we will first discuss the concept of an evolutionarily stable strategy 
(ESS) and its extensions. The ESS concept investigates the stability of the equilibrium under rare mutations, 
and it does not require the specification of the actual underlying game dynamics. The discussion of the various 
possible evolutionary dynamics, their behavior and relation to the static concepts will follow afterwards. 

In most of this section we restrict our attention to so-called "population games" . Due to a number of sim- 
plifying assumptions related to the type and number of agents and their interaction network, these models 
give rise to a relatively simple, aggregate level description. We defer the analysis of games with a finite 
number of players or with more complex social structure to later sections. There, statistical fluctuations 
stemming from the microscopic dynamics or from the structure of the social neighborhood cannot be 
neglected. These latter models are more complex, and usually necessitate a lower level, so-called agent-based, 
analysis. 

3.1. Population games 

A mean-field or population game is defined by the underlying two-player stage game, the set of feasible 
strategies (usually mixed strategies are not allowed or are strongly restricted), and the heuristic updating 
mechanism for the individual strategies (update rules). The definition tacitly implies the following simplifying 
assumptions: 

(i) The number of boundedly rational agents is very large, $N \to \infty$; 

(ii) All agents are equivalent and have identical payoff matrices (symmetric games), or agents form two 
different but internally homogeneous groups for the two roles (asymmetric games);[8] 

(iii) In each stage game (round) agents are randomly matched with equal probability (symmetric games), 
or agents in one group are randomly matched with agents in the other group (asymmetric games), 
thus the social network is the simplest possible; 

(iv) Strategy updates are rare in comparison with the frequency of playing, so that the update can be 
based on the average success rate of a strategy. 

(v) All agents use the same strategy update rule. 

(vi) Agents are myopic, i.e., their discount factor is small, $\delta \to 0$. 

In population games the fluctuations, arising from the randomness of the matching procedure, playing mixed 
strategies (if allowed), or from stochastic update rules, average out and can be neglected. For this reason 
population games comprise the mean-field level of evolutionary game theory. 

These simplifications allow us to characterize the overall behavioral profile of the population, and hence 
the game dynamics, by a restricted number of state variables. 

Matrix games 

Let us consider first a symmetric game $G = (A, A^T)$. Assume that there are $N$ players, $n = 1, \ldots, N$, 
each playing a pure strategy $s_n$ from the discrete set of feasible strategies $s_n \in S = \{e_1, e_2, \ldots, e_Q\}$. It is 
convenient to think of a strategy as a $Q$-component unit vector, whose $i$th component is 1 and whose other 
components are zero. Thus $e_i$ points to the $i$th corner of the strategy simplex $\Delta_Q$. 

Let $N_i$ denote the number of players playing strategy $e_i$. At any moment of time, the state of the population 
can be characterized by the relative frequencies (abundances, concentrations) of the different strategies, i.e., 
by the $Q$-dimensional state vector $\rho$: 

$$\rho = \frac{1}{N} \sum_{n=1}^{N} s_n = \sum_{i=1}^{Q} \frac{N_i}{N}\, e_i. \qquad (24)$$ 



8 In the asymmetric case population games are sometimes called two-population games to emphasize that there is a separate 
population for each possible role in the game. 
Clearly, the state vector $\rho$ satisfies $\sum_i \rho_i = \sum_i N_i/N = 1$, i.e., $\rho \in \Delta_Q$. Note that $\rho$ can be any point of the simplex $\Delta_Q$ even 
though players are only allowed to play pure strategies (the corners of the simplex) in the game. Put 
differently, the average strategy $\rho$ may not be in the set of feasible strategies for a single player. 

Payoffs can be expressed as a function of the strategy frequencies. The (normalized) expected payoff of 
player $n$ playing strategy $s_n$ in the population is 

$$u_n(s_n, \rho) = \frac{1}{N} \sum_{m=1,\, m \neq n}^{N} s_n \cdot A s_m = s_n \cdot A\rho, \qquad (25)$$ 

where the sum is over all players $m$ of the population except $n$, but this omission is negligible in the infinite 
population limit.[9] Equation (25) implies that for a given player the unified effect of the other players of 
the population looks as if she were playing against a single representative agent who plays the population's 
average strategy as a mixed strategy. 
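As a small numerical illustration of the state vector and of Eq. (25), the following sketch computes the strategy frequencies of a finite sample of players and the mean-field payoff of each pure strategy. The payoff matrix and the list of players are hypothetical values chosen only for the example, and strategies are encoded by integer indices 0, ..., Q-1 rather than by unit vectors.

```python
import numpy as np


def population_state(strategy_indices, Q):
    """Relative frequencies N_i / N of the Q pure strategies."""
    counts = np.bincount(strategy_indices, minlength=Q)
    return counts / counts.sum()


def mean_field_payoffs(A, rho):
    """Payoff e_i . A rho of each pure strategy against the average strategy rho."""
    return A @ rho


# Hypothetical 2x2 payoff matrix and a hypothetical sample of eight players.
A = np.array([[0.0, 3.0],
              [1.0, 2.0]])
players = np.array([0, 0, 1, 1, 1, 0, 1, 1])

rho = population_state(players, Q=2)      # frequencies of strategies 0 and 1
payoffs = mean_field_payoffs(A, rho)      # (A rho)_i for each strategy i
average_payoff = rho @ payoffs            # rho . A rho
```

The sketch uses the infinite-population form $s_n \cdot A\rho$ directly, so the self-interaction term that Eq. (25) omits is not removed here.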

Despite the formal similarity, $\rho$ is not a valid mixed strategy in the game, but the population average of 
pure strategies. It is not obvious at first sight how to cope with real mixed strategies when they are allowed 
in the stage game. Indeed, in case of an unrestricted $S \times S$ matrix game, there are $S$ pure strategies and 
an infinite number of mixed strategies: each point on the simplex $\Delta_S$ represents a possible mixed strategy 
of the game. When all these mixed strategies are allowed, the state vector $\rho$ is rather a "state functional" 
over $\Delta_S$. However, in many biological and economic applications the nature of the problem forbids mixed 
strategies. In this case the number of different strategy types in the population is simply $Q = S$. 

Even if mixed strategies are not forbidden in theory, it is frequently enough to consider one or two 
well-prepared mixed strategies in addition to the pure ones, for example, to challenge evolutionary stability 
(see later). In this case the number of types $Q$ ($> S$) can remain finite. The general recipe is to convert the 
original game which has mixed strategies to a new (effective) game which only has pure strategies. Assuming 
that the underlying stage game has $S$ pure strategies, and the construction of the game allows, in addition, 
a number $R$ of well-defined mixed strategies as combinations of these pure ones, we can always define an 
effective population game with $Q = S + R$ strategies, which are then treated on equal footing. The new game 
is associated with an effective payoff matrix of dimension $Q \times Q$, and from that point on, mixed strategies 
(as linear combinations of these $Q$ strategies) are formally forbidden. 

As a possible example (Cressman, 2003), consider population games based upon the $S = 2$ matrix game in 
Eq. (11). If the population game only allows the two pure strategies $e_1 = (1, 0)^T$ and $e_2 = (0, 1)^T$, the payoff 
matrix is that of Eq. (11), and the state vector, representing the frequencies of $e_1$ and $e_2$ agents, lies in 
$\Delta_2$. However, in the case when some mixed strategies are also allowed such as, for example, a third (mixed) 
strategy $e_3 = (1/2, 1/2)^T$, i.e., playing $e_1$ with probability 1/2 and $e_2$ with probability 1/2, the effective population 
game becomes a game with $Q = 3$. The effective payoff matrix for the three strategies $e_1$, $e_2$, and $e_3$ now 
reads 

$$A = \begin{pmatrix} a & b & \frac{a+b}{2} \\ c & d & \frac{c+d}{2} \\ \frac{a+c}{2} & \frac{b+d}{2} & \frac{a+b+c+d}{4} \end{pmatrix}. \qquad (26)$$ 
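The construction of an effective payoff matrix can be sketched in a few lines. The code below assumes that the $2 \times 2$ game of Eq. (11) has rows $(a, b)$ and $(c, d)$, and it builds the effective matrix entry by entry as $e_i \cdot A e_j$, where each allowed type is written as a mixed strategy over the original pure strategies. The function name and the numerical values (those quoted in the caption of Fig. 4) are used for illustration only.

```python
import numpy as np


def effective_payoff_matrix(A, types):
    """Effective Q x Q matrix with entries e_i . A e_j.

    `types` lists the Q allowed strategy types, each given as a probability
    vector over the S pure strategies of the underlying stage game.
    """
    E = np.asarray(types, dtype=float)     # shape (Q, S), one type per row
    return E @ np.asarray(A, dtype=float) @ E.T


# Assumed form of the 2x2 stage game: rows (a, b) and (c, d).
a, b, c, d = -2.0, 0.0, 0.0, -1.0
A2 = np.array([[a, b],
               [c, d]])

# Two pure strategies plus the mixed type e_3 = (1/2, 1/2).
types = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
A3 = effective_payoff_matrix(A2, types)
```

The same function handles any number $R$ of additional mixed types, giving a $Q = S + R$ dimensional effective game.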



9 This formulation assumes that the underlying game is a two-player matrix game. In a more general setup (non-matrix games, 
multi-player games) it is possible that the utility of a player cannot be written in a bilinear form containing her strategy and 
the mean strategy of the population as in Eq. (25), but is a more general nonlinear function, e.g., $u = s \cdot f(\rho)$ with $f$ nonlinear. 
Fig. 2. Schematic connectivity structure for (a) population (symmetric) and (b) two-population (asymmetric) games. Black 
and white dots represent players in different roles. 

It is also rather straightforward to formally extend the concept of a Nash equilibrium, which was only 
defined so far on the agent level, to the aggregate level, where only strategy frequencies are available. A 
state of the population $\rho^* \in \Delta_Q$ is called a Nash equilibrium of the population game, iff 

$$\rho^* \cdot A\rho^* \ge \rho \cdot A\rho^* \qquad \forall \rho \in \Delta_Q. \qquad (27)$$ 

Note that NEs of a population game are always symmetric equilibria of the two-player stage game. Asym- 
metric equilibria of the stage game like those appearing in the Anti-Coordination Class in Eq. (15) cannot 
be interpreted at the population level. 

It is not trivial that the definition in Eq. (27) is equivalent to the agent-level (microscopic) definition of 
an NE, i.e., given the average strategy of the population, no player has a unilateral incentive to change her 
strategy. We defer this question to Sec. 3.2, where we discuss evolutionary stability. 



Bi-matrix games 

In the case of asymmetric games (bi-matrix games) $G = (A, B^T)$ the situation is somewhat more 
complicated. Predators only play with Prey, Owners only with Intruders, Buyers only with Sellers. The simplest 
model along the above assumptions is a two-population game, one population for each possible kind of 
player. In this case random matching means that a certain player of one population is randomly connected with 
another player from the other population, but never with a fellow agent in the same population (see Fig. 2 
for an illustration). The state of the system is characterized by two state vectors: $\rho \in \Delta_Q$ for the first 
population and $\eta \in \Delta_R$ for the second population. Obviously, symmetrized versions of asymmetric games 
(role games) can be formulated as one-population games. 

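For two-population games the Nash equilibrium is a pair of state vectors $(\rho^*, \eta^*)$ such that 

$$\rho^* \cdot A\eta^* \ge \rho \cdot A\eta^* \qquad \forall \rho \in \Delta_Q,$$ 
$$\eta^* \cdot B\rho^* \ge \eta \cdot B\rho^* \qquad \forall \eta \in \Delta_R. \qquad (28)$$ 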



3.2. Evolutionary stability 



A central problem of evolutionary game theory is the stability and robustness of strategy profiles in a 
population. Indeed, according to the theory of "punctuated equilibrium" (Gould and Eldredge, 1993), 
(biological) evolution is not a continuous process, but is characterized by abrupt transition events of speciation 
and extinction of relatively short duration, which separate long periods of relative tranquility and stability. 
On the one hand, punctuated equilibrium theory explains missing fossil records related to "intermediate" species 
and the documented coexistence of freshly branched species; on the other hand it implies that most of the 
biological structure and behavioral pattern we observe around us, and aim to explain, is likely to possess a 
high degree of stability. In this sense a solution provided by a game theory model can only be a plausible 
solution if it is "evolutionarily" stable. 



Evolutionarily stable strategies 
The first concept of evolutionary stability was formulated by Maynard Smith and Price (1973) in the 
context of symmetric population games. An evolutionarily stable strategy (ESS) is a strategy that, when 
used by an entire population, is immune against invasion by a minority of mutants playing a different 
strategy. Players playing according to an ESS fare better than mutants in the population and thus in 
the long run outcompete and expel invaders. An ESS persists as the dominant strategy over evolutionary 
timescales, so strategies observed in the real world are typically ESSs. The ESS concept is relevant when the 
mutation (biology) or experimentation (economics) rate is low. In general evolutionary stability implies an 
invasion barrier, i.e., an ESS can only resist invasion until mutants reach a finite critical frequency in the 
population. The ESS concept does not contain any reference to the actual game dynamics, and thus it is a 
"static" concept. The only assumption it requires is that a strategy performing better must have a higher 
replication (growth) rate. 

In order to formulate evolutionary stability, consider a matrix game with $S$ pure strategies and all possible 
mixed strategies composed from these pure strategies, and a population in which the majority, a fraction $1 - \epsilon$ 
of the players, plays the incumbent strategy $p^* \in \Delta_S$, and a minority, a fraction $\epsilon$, plays a mutant strategy $p \in \Delta_S$. 
Both $p^*$ and $p$ can be mixed strategies. The average strategy of the population, which determines individual 
payoffs in a population (mean-field) game, is $\rho = (1 - \epsilon)p^* + \epsilon p \in \Delta_S$. The strategy $p^*$ is an ESS, iff for all 
$\epsilon > 0$ smaller than some appropriate invasion barrier $\bar{\epsilon}$, and for any feasible mutant strategy $p \in \Delta_S$, the 
incumbent strategy $p^*$ performs strictly better in the mixed population than the mutant strategy $p$, 

$$u(p^*, \rho) > u(p, \rho), \qquad (29)$$ 

i.e., 

$$p^* \cdot A\rho > p \cdot A\rho. \qquad (30)$$ 

If the inequality is not strict, $p^*$ is usually called a weak ESS. The condition in Eq. (30) takes the explicit 
form 

$$p^* \cdot A[(1 - \epsilon)p^* + \epsilon p] > p \cdot A[(1 - \epsilon)p^* + \epsilon p], \qquad (31)$$ 

which can be rewritten as 

$$(1 - \epsilon)(p^* \cdot Ap^* - p \cdot Ap^*) + \epsilon\,(p^* \cdot Ap - p \cdot Ap) > 0. \qquad (32)$$ 

It is thus clear that $p^*$ is an ESS, iff two conditions are satisfied: 

(1) NE condition: 

$$p^* \cdot Ap^* \ge p \cdot Ap^* \qquad \text{for all } p \in \Delta_S, \qquad (33)$$ 

(2) stability condition: 

$$\text{if } p \ne p^* \text{ and } p^* \cdot Ap^* = p \cdot Ap^*, \text{ then } p^* \cdot Ap > p \cdot Ap. \qquad (34)$$ 

According to (1) p* should be a symmetric Nash equilibrium of the stage game, hence the definition in Eq. 
(27), and according to (2) if it is a non-strict NE, then p* should fare better against p than p against itself. 
Clearly, all strict symmetric NEs are ESSs, and all ESSs are symmetric NEs, but not the other way around 
[see the upper panel in Fig. 3]. Most notably not all NEs are evolutionarily stable, thus the ESS concept 
provides a means for equilibrium selection. However, a game may have several ESSs or no ESS at all. 

As an illustration consider the Hawk-Dove game defined in Appendix A.4. The stage game belongs to the 
Anti-Coordination Class in Eq. (15) and has a symmetric mixed strategy equilibrium 

$$p^* = \left( \frac{V}{C},\; 1 - \frac{V}{C} \right)^T, \qquad (35)$$ 
[Fig. 3 labels: Strict Nash equilibria, Evolutionarily stable strategies, Nash equilibria; panels: (matrix), (bimatrix)] 

Fig. 3. Relation between evolutionary stability (ESS) and symmetric Nash equilibria for matrix and bi-matrix games. 

which is an ESS. Indeed, we find 

$$\forall p \in \Delta_2: \qquad p \cdot Ap^* = \frac{V}{2} - \frac{V^2}{2C}, \qquad (36)$$ 

independent of $p$. Thus Eq. (33) is satisfied for all $p$ with equality. Hence we should check the stability 
condition Eq. (34). Parameterizing $p$ as $p = (p, 1 - p)^T$, we find 

$$p^* \cdot Ap - p \cdot Ap = \frac{(V - pC)^2}{2C} > 0, \qquad \forall p \ne \frac{V}{C}, \qquad (37)$$ 

thus Eq. (34) is indeed satisfied, and $p^*$ is an ESS. It is in fact the only ESS of the Hawk-Dove game. 
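The two ESS conditions for the Hawk-Dove game can also be checked numerically. The sketch below assumes the commonly used Hawk-Dove payoff matrix with rows $((V-C)/2,\ V)$ and $(0,\ V/2)$, which is consistent with Eqs. (36) and (37) but is not reproduced from Appendix A.4; the values of $V$ and $C$ are arbitrary illustrative choices with $V < C$.

```python
import numpy as np


def hawk_dove_matrix(V, C):
    """Assumed standard Hawk-Dove payoffs: Hawk is strategy 0, Dove is strategy 1."""
    return np.array([[(V - C) / 2.0, V],
                     [0.0, V / 2.0]])


V, C = 2.0, 4.0                                  # illustrative values, V < C
A = hawk_dove_matrix(V, C)
p_star = np.array([V / C, 1.0 - V / C])          # mixed equilibrium, Hawk frequency V/C

# NE condition with equality: every strategy earns the same against p_star, cf. Eq. (36).
payoffs_against_p_star = A @ p_star


def stability_gap(p):
    """p_star . A q - q . A q for the mutant q = (p, 1 - p), cf. Eq. (37)."""
    q = np.array([p, 1.0 - p])
    return p_star @ A @ q - q @ A @ q


grid = np.linspace(0.0, 1.0, 101)
gaps = np.array([stability_gap(p) for p in grid])
closed_form = (V - grid * C) ** 2 / (2.0 * C)    # right-hand side of Eq. (37)
```

The gap is positive for every mutant except $p = V/C$ itself, in line with the statement that local superiority holds on the whole simplex for this game.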

An important theorem [see, e.g., Hofbauer and Sigmund (1998) or Cressman (2003)] says that a strategy 
$p^*$ is an ESS, iff it is locally superior, i.e., $p^*$ has a neighborhood in $\Delta_S$ such that for all $p \ne p^*$ in this 
neighborhood 

$$p^* \cdot Ap > p \cdot Ap. \qquad (38)$$ 



In the Hawk-Dove game example, as is shown by Eq. (37), local superiority holds in the whole strategy 
space, but this is not necessary in other games. 

Evolutionarily stable states and sets 

Even if a game has an evolutionarily stable strategy in theory, this strategy may not be feasible in practice. 
For instance, Hawk and Dove behavior may be genetically coded, and this coding may not allow for a mixed 
strategy. Therefore, when the Hawk-Dove game is played as a population game in which only pure strategies 
are allowed, there would be no feasible ESS. Neither pure Hawk nor pure Dove, the only individual strategies 
allowed, is evolutionarily stable. There is, however, a stable composition of the population with strategy 
frequencies $\rho^* = p^*$ given by Eq. (35). This state satisfies both the ESS conditions Eqs. (33) and (34) (with 
$p$ substituted everywhere by $\rho$) and the equivalent local superiority condition Eq. (38). It is customary to 
call such a state an evolutionarily stable state (also abbreviated traditionally as ESS), which is the direct 
extension of the ESS concept to population games with only pure strategies (Hofbauer and Sigmund, 1998). 

The extension of the ESS concept to composite states of a population would only make sense if these 
states had similarly strong stability properties as evolutionarily stable strategies. As we will see later in Sec. 
3.3, this is almost the case, but there are some complications. Because of the extension, the ESS concept loses 
some of its strength. Although it remains true that ESSs are dynamically stable, the converse loses validity: 
a state need not be an ESS to be dynamically stable and thus to persist over evolutionary 
times. 

Fig. 4. An ESset (line along the black dots) for the degenerate Hawk-Dove game Eq. (26) with $a = -2$, $b = c = 0$, $d = -1$. 

Up to now we have discussed the properties of ESSs, but how does one find an ESS? One possibility is to use the 
Bishop-Cannings Theorem (Bishop and Cannings, 1978), which provides a necessary condition for an ESS: 

If $p^*$ is a mixed evolutionarily stable strategy with support $I = \{e_1, e_2, \ldots, e_k\}$, i.e., $p^* = \sum_{i=1}^{k} p_i e_i$ with 
$p_i > 0$ $\forall\, 1 \le i \le k$, then 

$$u(e_1; p^*) = u(e_2; p^*) = \ldots = u(e_k; p^*) = u(p^*; p^*), \qquad (39)$$ 

which leads to Eq. (17), the condition of an interior NE, when payoffs are linear. Bishop and Cannings have 
also proved that if $p^*$ is an ESS with support $I$ and $\eta^* \ne p^*$ is another ESS with support $J$, then neither 
support set can contain the other. Consequently, if a matrix game has an interior ESS, then it is the only 
ESS. 
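For linear payoffs the equal-payoff condition of Eq. (39) turns into a linear system, which suggests a simple way of locating candidates for an interior ESS. The sketch below solves $(Ap)_i = u$ for all $i$ together with $\sum_i p_i = 1$; the function name is hypothetical, and the Hawk-Dove matrix used as input is the assumed standard form with illustrative values $V = 2$, $C = 4$. Since the theorem gives only a necessary condition, any candidate found this way still has to pass the stability condition Eq. (34).

```python
import numpy as np


def interior_equilibrium_candidate(A):
    """Solve (A p)_i = u for all i with sum(p) = 1.

    Returns (p, u) if the solution lies strictly inside the simplex,
    otherwise (None, None). Assumes the linear system is non-singular.
    """
    A = np.asarray(A, dtype=float)
    Q = A.shape[0]
    M = np.zeros((Q + 1, Q + 1))
    M[:Q, :Q] = A          # payoff of each pure strategy against p ...
    M[:Q, Q] = -1.0        # ... minus the common payoff level u
    M[Q, :Q] = 1.0         # normalization of p
    rhs = np.zeros(Q + 1)
    rhs[Q] = 1.0
    solution = np.linalg.solve(M, rhs)
    p, u = solution[:Q], solution[Q]
    if np.all(p > 0):
        return p, u
    return None, None


# Assumed standard Hawk-Dove matrix with V = 2, C = 4 (illustrative values).
A_hd = np.array([[-1.0, 2.0],
                 [0.0, 1.0]])
p_candidate, common_payoff = interior_equilibrium_candidate(A_hd)
```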

The ESS concept has a further extension towards evolutionarily stable sets (ESset), which becomes useful 
in games with singular payoff matrices. An ESset is an NE component (a connected set of NEs) which is 
stable in the following sense: no mutant can spread when the incumbent is using a strategy in the ESset, 
and mutants using a strategy not belonging to the ESset are driven out (Balkenborg and Schlag, 1995). In 
fact each point in the ESset is neutrally stable, i.e., when the incumbent and the mutant are both elements 
of the set they fare equally well in the population. 

As a possible example consider again the game defined by the effective payoff matrix Eq. (26). If the basic 
2-strategy game has an interior NE, $p^* = (p_1^*, p_2^*)^T$, the 3-strategy game has a boundary NE $(p_1^*, p_2^*, 0)^T$. 
Then the line segment $E = \{\rho \in \Delta_3 : \rho_1 + \rho_3/2 = p_1^*\}$ is an NE component (see Fig. 4). It is an ESset, iff 
the 2-strategy game belongs to the Anti-Coordination (Hawk-Dove) Class (Cressman, 2003). 



Bi-matrix games 

The ESS concept was originally invented for symmetric games, and later extended for asymmetric games. 
In many respects the situation is easier in the latter. Bi-matrix games are two-population games, and thus 
a mutant appearing in one of the populations never plays with fellow mutants in her own population. 
In technical terms this means that the condition (2) in Eq. (34) is irrelevant. Condition (1) is the Nash 
equilibrium condition. When the NE is not strict there exists a mutant strategy in at least one of the 
populations which is just as fit as the incumbent. These mutants are not driven out although they cannot 
spread either. This kind of neutral stability is usually called drift in biology: the composition of the population 
changes by pure chance. Strict NEs, on the other hand, are always evolutionarily stable and, as was shown 
by Selten (1980), in asymmetric games a strategy pair is an ESS iff it is a strict NE (see the lower panel in 
Fig. 3). 
Since interior NEs are never strict it follows that asymmetric games can only have ESSs on $\mathrm{bd}(\Delta_Q \times \Delta_R)$. 
Again the number of ESSs can be high and there are games, e.g., the asymmetric, two-population version 
of the Rock-Scissors-Paper game, where there is no ESS. 

Similarly to the ESset concept for symmetric games, the concept of strict equilibrium sets (SEset) can be 
introduced (Cressman, 2003). An SEset $F \subset \Delta_Q \times \Delta_R$ is a set of NE strategy pairs such that $(\rho, \eta^*) \in F$ 
whenever $\rho \cdot A\eta^* = \rho^* \cdot A\eta^*$ and $(\rho^*, \eta) \in F$ whenever $\eta \cdot B\rho^* = \eta^* \cdot B\rho^*$ for some $(\rho^*, \eta^*) \in F$. Again, 
mutants outside the SEset are driven out from the population, while those within the SEset cannot spread. 
When mutations appear randomly, the composition of the two populations drifts within the SEset. 

3.3. Replicator dynamics 

A model in evolutionary game theory is made complete by postulating the game dynamics, i.e., the rules 
that describe the update of strategies in the population. Depending on the actual problem, different kinds of 
dynamics can be appropriate. The game dynamics can be continuous or discrete, deterministic or stochastic, 
and within these major categories a large number of different rules can be formulated depending on the 
situation under investigation. 

On the macroscopic level, by far the most studied continuous evolutionary dynamics is the replicator 
dynamics. It was introduced originally by Taylor and Jonker (1978), and it has exceptional status in the 
models of biological evolution. On the phenomenological level the replicator dynamics can be postulated 
directly by the reasonable assumption that the per capita growth rate $\dot{\rho}_i/\rho_i$ of a given strategy type is 
proportional to the fitness difference 

$$\frac{\dot{\rho}_i}{\rho_i} = \text{fitness of type } i - \text{average fitness}. \qquad (40)$$ 

The fitness is the individual's evolutionary success, i.e., in the game theory context the payoff of the game. 
In population games the fitness of strategy $i$ is $(A\rho)_i$, whereas the average fitness is $\rho \cdot A\rho$. This leads to 
the equation 

$$\dot{\rho}_i = \rho_i \left( (A\rho)_i - \rho \cdot A\rho \right), \qquad (41)$$ 

which is usually called the Taylor form of the replicator equation. 

Under slightly different assumptions the replicator equation takes the form 

$$\dot{\rho}_i = \rho_i\, \frac{(A\rho)_i - \rho \cdot A\rho}{\rho \cdot A\rho}, \qquad (42)$$ 

which is the so-called Maynard Smith form (or adjusted replicator equation). In this form the driving force is the 
relative fitness difference. Note that the denominator is only a rescaling of the flow velocity. For both forms 
the simplex $\Delta_Q$ is invariant and so are all of its faces: if the initial condition does not contain a certain 
strategy $i$, i.e., $\rho_i(t = 0) = 0$ for some $i$, then it remains $\rho_i(t) = 0$ for all $t$. The replicator dynamics does 
not invent new strategies, and as such it is the prototype of a wider class of dynamics called non-innovative 
dynamics. Both forms of the replicator equation can be deduced rigorously from microscopic assumptions 
(see later). 
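The Taylor form of the replicator equation is easy to integrate numerically. The following sketch uses a fixed-step fourth-order Runge-Kutta scheme, which is our choice for illustration and not a method prescribed by the text; the step size, the number of steps, and the Hawk-Dove matrix (assumed standard form with $V = 2$, $C = 4$) are likewise illustrative assumptions.

```python
import numpy as np


def replicator_rhs(rho, A):
    """Taylor form: d(rho_i)/dt = rho_i * ((A rho)_i - rho . A rho)."""
    fitness = A @ rho
    return rho * (fitness - rho @ fitness)


def integrate_replicator(rho0, A, dt=0.01, steps=5000):
    """Fixed-step RK4 integration of the replicator equation (illustrative)."""
    rho = np.asarray(rho0, dtype=float)
    for _ in range(steps):
        k1 = replicator_rhs(rho, A)
        k2 = replicator_rhs(rho + 0.5 * dt * k1, A)
        k3 = replicator_rhs(rho + 0.5 * dt * k2, A)
        k4 = replicator_rhs(rho + dt * k3, A)
        rho = rho + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return rho


# Assumed standard Hawk-Dove matrix with V = 2, C = 4 (illustrative values).
V, C = 2.0, 4.0
A = np.array([[(V - C) / 2.0, V],
              [0.0, V / 2.0]])

final_state = integrate_replicator([0.9, 0.1], A)    # interior initial condition
corner_state = integrate_replicator([0.0, 1.0], A)   # initial condition on a face
```

In this example the interior trajectory approaches the mixed state with Hawk frequency $V/C$, while the trajectory started on the boundary stays there, illustrating the invariance of the faces of the simplex.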

The replicator dynamics lends the notion of evolutionary stability, originally introduced as a static concept, 
an explicit dynamic meaning. From the dynamical point of view, fixed points (rest points, stationary states) 
of the replicator equation play a distinguished role since a population which happens to be exactly in the 
fixed point at $t = 0$ remains there forever. Stability against fluctuations is, however, a crucial issue. It is 
customary to introduce the following classification of fixed points: 

(a) A fixed point $p^*$ is stable (also called Lyapunov stable) if for every open neighborhood $U$ of $p^*$ (however 
small) there is another open neighborhood $O \subset U$ such that any trajectory initially inside $O$ remains 
inside $U$. 

(b) The fixed point $p^*$ is unstable if it is not stable. 

(c) The fixed point $p^*$ is attractive if there exists an open neighborhood $U$ of $p^*$ such that all trajectories 
initially in $U$ converge to $p^*$. The maximum possible $U$ is called the basin of attraction of $p^*$. 

(d) The fixed point $p^*$ is asymptotically stable (also called an attractor) if it is stable and attractive.[10] In 
general a fixed point is globally asymptotically stable if its basin of attraction encompasses the whole space. 
In the case of the replicator dynamics where $\mathrm{bd}\,\Delta_Q$ is invariant, a state is called globally asymptotically 
stable if its basin of attraction contains $\mathrm{int}\,\Delta_Q$. 

These definitions can be trivially extended from fixed points to invariant sets such as a set of fixed points 
or limit cycles, etc. 



Dynamic vs. evolutionary stability 



What is the connection between dynamic stability and evolutionary stability? Unfortunately, the two 
concepts do not perfectly overlap. The actual relationship is best summarized in the form of two collections 
of theorems: one relating to Nash equilibria, the other to evolutionary stability. As for Nash equilibria in 
matrix games under the replicator dynamics the Folk Theorem of Evolutionary Game Theory asserts: 

(a) NEs are rest points. 

(b) Strict NEs are attractors. 

(c) If an interior orbit converges to p* , then it is an NE. 

(d) If a rest point is stable then it is an NE. 

See Hofbauer and Sigmund (1998), Cressman (2003), and Hofbauer and Sigmund (2003) for detailed discussions 
and proofs. 

None of the converse statements hold in general. For (a) there can be rest points which are not NEs. These 
can only be situated on $\mathrm{bd}\,\Delta_Q$. It is easy to show that a boundary fixed point is an NE iff its transversal 
eigenvalues (associated with eigenvectors of the Jacobian transverse to the boundary) are all non-positive 
(Hofbauer and Sigmund, 2003). An interior fixed point is trivially an NE. 



As for (b), not all attractors are strict NEs. As an example consider the matrix (Hofbauer and Sigmund, 
1998; Zeeman, 1980) 
(43) 

with the associated flow diagram in Fig. 5. The rest point [1/3, 1/3, 1/3] is an attractor (the eigenvalues 
have negative real parts), but being in the interior of $\Delta_3$ it is a non-strict NE. 

It is easy to construct examples in which a point is an NE, but it is not a limit point of an interior orbit. 
Thus the converse of (c) does not hold. For an example of dynamically unstable Nash equilibria consider 
the following family of games (Cressman, 2003) 

$$A = \begin{pmatrix} 0 & 2 - a & 4 \\ 6 & 0 & -4 \\ -2 & 8 - a & 0 \end{pmatrix}. \qquad (44)$$ 



These games are generalized Rock-Scissors-Paper games which have the cyclic dominance property for 
$2 < a < 8$. Even beyond this interval, for $-16 < a < 14$, there is a unique symmetric interior NE, 



10 Note that a fixed point p* can be attractive without being stable. This is the case if there are trajectories which start close 
to p* , but take a large excursion before converging to p* (see an example later in Sec. 6.10). Being attractive is not enough to 
classify as an attractor, stability is also required. 
Fig. 5. Flow pattern for the game in Eq. (43). Red (blue) colors indicate fast (slow) flow. Black (white) circles are stable 
(unstable) rest points. Figure made by the game dynamics simulation program "Dynamo" (Sandholm and Dokumaci, 2006). 



$p^* = [28 - 2a,\ 20,\ 16 + a]/(64 - a)$, which is a rest point of the replicator dynamics (see Fig. 6). This rest 
point is attractive for $-16 < a < 6.5$ but repulsive for $6.5 < a < 14$. 




Fig. 6. Trajectories predicted by Eq. (44) for a = 5.5, 6.5, and 7.5 (from left to right).
Figures made by the "Dynamo" program (Sandholm and Dokumaci, 2006).

The main results for the subtle relation between the replicator dynamics and evolutionary stability can
be summarized as follows (Hofbauer and Sigmund, 1998; Cressman, 2003; Hofbauer and Sigmund, 2003):



(a) ESSs are attractors. 

(b) Interior ESSs are global attractors.

(c) For potential games a fixed point is an ESS iff it is an attractor. 

(d) For 2x2 matrix games a fixed point is an ESS iff it is an attractor. 

Again, the converses of (a) and (b) do not necessarily hold. To show a counterexample where an attractor
is not an ESS consider again the payoff matrix $A_{7_1}$ in Eq. (43) and Fig. 5. The only ESS in the game is
$e_1 = [1,0,0]$ on bd $\Delta_3$. The NE [1/3,1/3,1/3] is an attractor but not evolutionarily stable. Recall that the
Bishop-Cannings theorem forbids the occurrence of interior and boundary ESSs at the same time.

Fig. 7. Classification of two-strategy (symmetric) matrix games based on the replicator dynamics flow. Panels: Coordination,
Anti-Coordination, Pure Dominance. Full circles are stable, open circles are unstable NEs.

Another example is Eq. (44), where the fixed point p* is only an ESS for $-16 < a < 3.5$. At $a = 3.5$ it is
neutrally evolutionarily stable. For $3.5 < a < 6.5$ the fixed point is dynamically stable but not evolutionarily
stable.



Classification of matrix game phase portraits 



The complete classification of possible replicator phase portraits for two-strategy games, where $\Delta_2$ is
one-dimensional, is easy and goes along with the classification of Nash equilibria discussed in Sec. 2.4. The
three possible classes are shown in Fig. 7. In the Coordination Class the mixed strategy NE is an unstable
fixed point of the replicator dynamics. Trajectories converge to one or the other of the pure strategy NEs, which
are both ESSs (cf. theorem (d) above). For the Anti-Coordination Class the interior (symmetric) NE is the
global attractor; thus it is the unique ESS. In the Pure Dominance Class there is one pure strategy
ESS, which is again a global attractor.
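The three classes can be read off from two payoff differences. The following minimal sketch assumes a symmetric two-strategy game with payoff matrix [[a, b], [c, d]] (this labeling of the entries is an assumption made here, not notation from the text). With x the frequency of the first strategy, the replicator equation reduces to dx/dt = x(1 - x)[(a - c)x + (b - d)(1 - x)], so the signs of a - c and d - b decide the class, and the interior rest point, when it exists, is x* = (d - b)/(a - c + d - b).

```python
def classify_two_strategy_game(A):
    """Classify a symmetric 2x2 game by its replicator flow.

    Returns (class_name, interior_rest_point or None).
    """
    (a, b), (c, d) = A
    g1 = a - c  # advantage of strategy 1 in a population of 1-players
    g2 = d - b  # advantage of strategy 2 in a population of 2-players
    if g1 == 0 or g2 == 0:
        return "degenerate", None
    if g1 > 0 and g2 > 0:
        # both pure strategies are strict NEs, interior point is unstable
        return "Coordination", g2 / (g1 + g2)
    if g1 < 0 and g2 < 0:
        # interior point is the global attractor
        return "Anti-Coordination", g2 / (g1 + g2)
    # one strategy is better against both opponents
    return "Pure Dominance", None

# Illustrative payoff values (chosen here, not taken from the source).
print(classify_two_strategy_game([[2, 0], [0, 1]]))   # coordination-type game
print(classify_two_strategy_game([[-1, 2], [0, 1]]))  # Hawk-Dove-type game
print(classify_two_strategy_game([[3, 0], [5, 1]]))   # Prisoner's-Dilemma-type game
```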

A similar complete classification in the case of three-strategy matrix games is more tedious (Hofbauer and Sigmund,
2003). In the generic case there are 19 different phase portraits for the replicator dynamics (Zeeman, 1980),
and this number rises to about fifty when degenerate cases are included (Bomze, 1983, 1995). In general
Zeeman's classification is based on counting the Nash equilibria: there can be at most one interior NE, three
NEs on the boundary faces, and three pure strategy NEs. An NE can be stable, unstable or a saddle point,
but topological rules only allow a limited number of combinations of these. Four possibilities from the 19
are shown in Fig. 8 (other classes have been shown in Figs. 5 and 6). The subscript of A denotes the Zeeman
classification code (Zeeman, 1980). In degenerate cases extended NE components can appear.

An important result is that no isolated limit cycles exist (Zeeman, 1980). Limit cycles can exist in degenerate
cases but then they occur in non-isolated families, as for the Rock-Scissors-Paper game in Eq. (44)
with a = 6.5 as shown in Fig. 6.

Isolated limit cycles can exist in four-strategy games (and above), and there numerical simulations also
indicate the possible presence of chaotic attractors (Hofbauer and Sigmund, 2003). A complete taxonomy
of all possible behaviors (topologically distinct phase portraits) for more than three strategies seems rather
hopeless and has not yet been given.

Fig. 8. Some possible flow diagrams for three-strategy games. Upper left: Zeeman's code 8; upper right: $7_2$; lower left: $10_1$;
lower right: $-10_2$. Red (blue) colors indicate fast (slow) flow. Black (white) circles are stable (unstable) rest points. Figures
made by the simulation program "Dynamo" (Sandholm and Dokumaci, 2006).

Classification of bi-matrix game phase portraits 

The generalization of the replicator dynamics for bi-matrix games is rather straightforward 



$$\dot p_i = p_i \left[ (A\eta)_i - p \cdot A\eta \right], \qquad \dot\eta_i = \eta_i \left[ (Bp)_i - \eta \cdot Bp \right] \tag{45}$$



where $p$ and $\eta$ denote the state of the two populations, respectively. (The Maynard Smith form in Eq.
(42) can be generalized similarly.) The relation between dynamic and evolutionary stability is somewhat
easier for bi-matrix than for matrix games. We have already seen that ESSs are strict NEs and vice versa.
As a set-wise extension of bi-matrix evolutionary stability we have introduced SESets. It turns out that
these concepts are exactly equivalent to asymptotic stability: in bi-matrix games a set of rest points of the
bi-matrix replicator dynamics Eq. (45) is an attractor if and only if it is a SESet (Cressman, 2003).

It can also be proven that the bi-matrix replicator flow is incompressible, thus there can be no interior 
attractor. Consequently, ESSs, if any, are necessarily pure strategies in bi-matrix games. 

As an example consider the Owner-Intruder game, the bi-matrix (two-population) version of the traditional
Hawk-Dove game. All players in this game have a tag: they are either Owners or Intruders (of, say, a
territory). Both Owners and Intruders can play two strategies: Hawk or Dove. The payoff matrix is
equal to that of the Hawk-Dove game [see Eq. (A.2)], but the social connectivity is such that Owners only
play with Intruders and vice versa, thus it is a two-population game with a connectivity structure shown
in Fig. 2(b). In fact this is the simplest possible deviation from mean-field connectivity in the Hawk-Dove
game, and as is shown by the flow diagram in the upper left part of Fig. 9 this has strong consequences.
On the $(p, \eta)$ plane, where $p$ ($\eta$) is the ratio of Owners (Intruders) playing the Hawk strategy, the interior
fixed point, which was an attractor for the original Hawk-Dove game, now becomes a saddle point. Having lost
stability, it is no longer an ESS. Instead the replicator dynamics flows towards two possible pure strategy
pairs: either all Owners play Hawk and all Intruders play Dove, i.e., $(p^*, \eta^*) = (1,0)$, or vice versa, i.e.,
$(p^*, \eta^*) = (0,1)$. These strategy pairs, and only these, are evolutionarily stable.
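The loss of stability of the interior point can be seen in a few lines of numerics. The sketch below is an illustration under stated assumptions: it uses the common Hawk-Dove payoffs (Hawk against Hawk earns (V - C)/2, Hawk against Dove earns V, Dove against Hawk earns 0, Dove against Dove earns V/2) with the arbitrary values V = 2 and C = 4, which may differ from the parameterization of Eq. (A.2), and a plain Euler integrator. With these payoffs the two-population replicator equations reduce to dp/dt = p(1 - p)(V - C*eta)/2 and d(eta)/dt = eta(1 - eta)(V - C*p)/2.

```python
def owner_intruder_flow(p, eta, V=2.0, C=4.0):
    """Two-population replicator velocity for the assumed Hawk-Dove payoffs.

    p (eta) is the fraction of Owners (Intruders) playing Hawk.
    """
    dp = p * (1 - p) * (V - C * eta) / 2
    deta = eta * (1 - eta) * (V - C * p) / 2
    return dp, deta

def integrate(p, eta, dt=0.01, steps=5000):
    """Euler integration of the flow; adequate for a qualitative picture."""
    for _ in range(steps):
        dp, deta = owner_intruder_flow(p, eta)
        p, eta = p + dt * dp, eta + dt * deta
    return p, eta

# Two starting points on opposite sides of the saddle at (1/2, 1/2).
print(integrate(0.6, 0.4))  # approaches (1, 0): Owners play Hawk
print(integrate(0.4, 0.6))  # approaches (0, 1): Intruders play Hawk
```

Both starting points are close to the interior fixed point (V/C, V/C), yet they end up in opposite corners, as expected near a saddle.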




Fig. 9. Archetypes of bi-matrix phase portraits in two-population games with the replicator dynamics. The players have two
options: T or B for the row player, and L or R for the column player. Upper left: Saddle Class, upper right: Center Class, lower
left: Corner Class, lower right: a degenerate case with $\alpha = 0$. Red (blue) colors indicate fast (slow) flow. Black (white) circles
are stable (unstable) rest points. Figures made by the simulation program "Dynamo" (Sandholm and Dokumaci, 2006).

It is possible to give a complete classification of the different phase portraits of bi-matrix games where
both players have two possible pure strategies (Cressman, 2003). Since the replicator dynamics is invariant
under adding constants to the columns of the payoff matrices and under interchanging strategies, the bi-matrix can
be parameterized as

$$(A,B) = \begin{pmatrix} (a,\alpha) & (0,0) \\ (0,0) & (d,\delta) \end{pmatrix} \tag{46}$$

An interior NE exists if and only if $ad > 0$ and $\alpha\delta > 0$, and reads

$$p^* = \frac{\delta}{\alpha + \delta}, \qquad \eta^* = \frac{d}{a + d} \tag{47}$$



There are three generic cases: 

Saddle Class [$a, d, \alpha, \delta > 0$]. The interior NE is a saddle point singularity. It is not an ESS. There are
two pure strategy attractors in opposite corners (both pairs are possible), which are ESSs. The bi-matrix
Coordination game and the Owner-Intruder game belong to this class. See the upper left panel of Fig. 9
for an illustration.

Center Class [$a, d > 0$; $\alpha, \delta < 0$]. There is only one NE, the interior NE. It is a center-type singularity.
There are no ESSs. On the boundary the trajectories form a heteroclinic cycle. The trajectories do not
converge to an NE. The Buyer-Seller game and Matching Pennies are typical examples. See the upper
right part of Fig. 9.

Corner Class [$a < 0 < d$; $\delta > 0$; $\alpha \neq 0$]. There is no interior NE. All trajectories converge to a unique ESS
in the corner. The Prisoner's Dilemma game when played as a two-population game is in this class. See
the lower left part of Fig. 9.
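The three generic classes can be told apart from the signs of the four parameters of the normalized bi-matrix. The sketch below is a hedged illustration, not the procedure of Cressman (2003): the function names are hypothetical, and sign patterns that differ from the ones listed above only by a relabeling of strategies or players are assigned to the same class, which is an interpretation made here. The Saddle/Center test uses the fact that linearizing the two-population replicator dynamics at the interior NE gives eigenvalues whose square is proportional to $(a+d)(\alpha+\delta)$, so the point is a saddle when $a$ and $\alpha$ have the same sign and a center otherwise.

```python
def classify_bimatrix(a, d, alpha, delta):
    """Class of the 2x2 bi-matrix game ((a, alpha), (0, 0); (0, 0), (d, delta))."""
    if 0 in (a, d, alpha, delta):
        return "Degenerate"
    if a * d < 0 or alpha * delta < 0:
        return "Corner"   # no interior NE: one player has a dominant strategy
    return "Saddle" if a * alpha > 0 else "Center"

def interior_ne(a, d, alpha, delta):
    """Interior NE (p*, eta*) = (delta/(alpha+delta), d/(a+d)), or None if absent."""
    if a * d > 0 and alpha * delta > 0:
        return delta / (alpha + delta), d / (a + d)
    return None

# Illustrative parameter values (chosen here, not taken from the source).
print(classify_bimatrix(1, 1, 1, 1), interior_ne(1, 1, 1, 1))
print(classify_bimatrix(1, 1, -1, -1), interior_ne(1, 1, -1, -1))
print(classify_bimatrix(-1, 1, 1, 1), interior_ne(-1, 1, 1, 1))
```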

In addition to the three generic cases there are degenerate cases when one or more of the four parameters
are zero. None of these contains an interior singularity, but many of them contain an extended NE component,
which may, but need not, be a SESet. We refer the interested reader to Cressman (2003) for details.
Practically interesting games like the Centipede game of Length Two or the Chain Store game belong to
these degenerate cases. The latter is defined by the payoff matrix

$$(A,B) = \begin{pmatrix} (0,4) & (0,4) \\ (2,2) & (-4,-4) \end{pmatrix} \tag{48}$$

and the associated replicator flow diagram with a SESet is illustrated in the lower right panel of Fig. 9.

3.4. Other game dynamics 

A large number of different population-level dynamics are discussed in the game theory literature. These
can be either derived rigorously from microscopic (i.e., agent-based) strategy update rules in the large
population limit (see Sec. 4.4), or they are simply posited as the starting point of the analysis on the aggregate
(population, macro) level. Many of these share important properties with the replicator dynamics, others
behave quite differently. An excellent recent review on the various game dynamics is Hofbauer and Sigmund
(2003).



There are two important dimensions along which macro (aggregate) dynamics can be classified: (1) being
innovative or non-innovative, and (2) being payoff monotone or non-monotone. Innovative dynamics have
the potential to introduce new strategies not currently present in the population, or revive formerly extinct
ones. Non-innovative dynamics maintain (sometimes diminish) the support of the strategy space. Payoff
monotonicity, on the other hand, refers to the relative speed of the change of strategy frequencies. A dynamics
is payoff monotone if for any two strategies i and j,

$$\frac{\dot p_i}{p_i} > \frac{\dot p_j}{p_j} \iff u_i(p) > u_j(p), \tag{49}$$

i.e., iff at any time strategies with higher payoff proliferate more rapidly than those with lower payoff. By
these definitions, the Replicator dynamics is non-innovative and payoff monotone. Although some dynamics
may not share all the qualitative properties of the replicator dynamics, it can be shown that when a dynamics
is payoff monotone, the Folk theorem, discussed above, remains valid (Hofbauer and Sigmund, 2003).



Best Response Dynamics 

A typical innovative dynamics is the Best Response dynamics: in each moment a small fraction of the
population updates its strategy, choosing the best response (BR) to the current state $p$ of the system,
leading to the dynamical equation

$$\dot p = BR(p) - p. \tag{50}$$

Usually the game is such that in a large domain of the strategy simplex the best response is a unique (and
hence pure) strategy $\beta$. Then the solution of Eq. (50) is a linear orbit

$$p(t) = (1 - e^{-t})\,\beta + e^{-t} p_0, \tag{51}$$

where $p_0$ is the aggregate state of the population at t = 0. The solution tends towards $\beta$, but this is only
valid up to a finite time t', when the actual best response suddenly jumps to another strategy $\beta'$. From this
time on the solution is another linear orbit tending towards $\beta'$, again up to another singular point, and so
on. The overall trajectory is composed of linear segments.

The Best Response dynamics can produce qualitatively different behavior from the Replicator equation.
As an example (Hofbauer and Sigmund, 2003), consider the trajectories for the Rock-Scissors-Paper-type
game

$$A = \begin{pmatrix} 0 & -1 & b \\ b & 0 & -1 \\ -1 & b & 0 \end{pmatrix} \tag{52}$$

with b = 0.55, as depicted in Fig. 10. This shows the appearance of a limit cycle, the so-called Shapley
triangle (Shapley, 1964), for the Best Response dynamics, but not for the Replicator equation.
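The piecewise linear character of the Best Response orbit is easy to reproduce numerically. The sketch below is illustrative only: it builds the trajectory for the Rock-Scissors-Paper-type game of Eq. (52) with b = 0.55 by chaining short closed-form segments of Eq. (51) and re-evaluating the best response after each one. The step size, the starting point, and the tie-breaking rule (the lowest strategy index wins) are assumptions made here.

```python
import math

B = 0.55
A = [[0.0, -1.0, B],
     [B, 0.0, -1.0],
     [-1.0, B, 0.0]]

def best_response(p):
    """Index of the pure strategy with the highest payoff (A p)_i."""
    u = [sum(A[i][j] * p[j] for j in range(3)) for i in range(3)]
    return max(range(3), key=lambda i: u[i])  # ties: lowest index

def br_segment(p0, t):
    """Closed-form solution, Eq. (51), valid while the best response is fixed."""
    beta = best_response(p0)
    decay = math.exp(-t)
    return [(1 - decay) * (1.0 if i == beta else 0.0) + decay * p0[i]
            for i in range(3)]

def br_trajectory(p0, dt=0.01, steps=2000):
    """Chain short segments, re-evaluating the best response at every step."""
    path = [list(p0)]
    for _ in range(steps):
        path.append(br_segment(path[-1], dt))
    return path

path = br_trajectory([0.5, 0.3, 0.2])
print(path[-1], [best_response(p) for p in path[::400]])
```

Printing the best response along the path shows it switching from one pure strategy to the next, which is what produces the corners of the trajectory.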








Fig. 10. Trajectories predicted by Eqs. (41), (50) and (53) for the game Eq. (52) with b = 0.55. Upper panel: Replicator equation 
with a repulsive fixed point; lower left: Best Response dynamics with a stable limit cycle; lower right: Logit dynamics with 
K = 0.1 showing an attractive fixed point. 

Logit Dynamics 

A generalization of the Best Response dynamics for bounded rationality is the Logit dynamics (Smoothed
Best Response), defined as (Fudenberg and Levine, 1998)

$$\dot p_i = \frac{\exp[u_i(p)/K]}{\sum_j \exp[u_j(p)/K]} - p_i. \tag{53}$$

In the limit when the noise parameter $K \to 0$, we get back the Best Response equation (50). Again, the
Logit dynamics may produce rather different qualitative results than other dynamics. For the game in Eq.
(52) with b = 0.55 the system goes through a Hopf bifurcation when K is varied. There is a critical value
$K_c \approx 0.07$ below which the interior fixed point is unstable and a limit cycle develops, whereas above $K_c$ the
fixed point becomes attractive (see Fig. 10).
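A short numerical experiment illustrates the attractive fixed point above the critical noise level. The sketch below is a minimal illustration, assuming the game of Eq. (52) with b = 0.55, the Logit dynamics in the form "smoothed best response minus current state", and a plain Euler integrator; the step size and the starting point are arbitrary choices made here. For K = 0.1 a trajectory started near the center of the simplex spirals into it.

```python
import math

B = 0.55
A = [[0.0, -1.0, B],
     [B, 0.0, -1.0],
     [-1.0, B, 0.0]]

def logit_response(p, K):
    """Smoothed best response: softmax of the payoffs (A p)_i with noise K."""
    u = [sum(A[i][j] * p[j] for j in range(3)) for i in range(3)]
    u_max = max(u)  # subtracting the maximum avoids overflow in exp
    w = [math.exp((ui - u_max) / K) for ui in u]
    total = sum(w)
    return [wi / total for wi in w]

def logit_step(p, K, dt=0.01):
    """One Euler step of dp/dt = logit_response(p) - p."""
    target = logit_response(p, K)
    return [pi + dt * (ti - pi) for pi, ti in zip(p, target)]

def run(p, K, steps=20000):
    for _ in range(steps):
        p = logit_step(p, K)
    return p

print(run([0.34, 0.33, 0.33], K=0.1))  # ends very close to [1/3, 1/3, 1/3]
```

Repeating the run with a smaller K (for example 0.05) is a simple way to explore the regime in which the text reports an unstable fixed point and a limit cycle.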



Adaptive Dynamics

All dynamics discussed so far model how strategies compete with each other on shorter time scales. Given
the initial composition of the population a possible outcome is that one of the strategies drives out all others
and prevails in the system. However, when this strategy is not an ESS it is unstable against the attack of
some superior mutants (not present originally in the population). Adaptive Dynamics (Metz et al., 1996;
Dieckmann and Metz, 1996; Geritz et al., 1997; Nowak and Sigmund, 2004) models the situation when the
mutation rate is low and possible mutants only differ slightly from residents. Under these conditions the
dynamic behavior is governed by short transient periods when a successful mutant conquers the system,
separated by long periods of relative tranquility with unsuccessful invasion attempts by inferior mutants.
With each successful invasion event, whose details are not modeled, the defining parameters of the prevailing
strategy change a little. The process can be modeled on longer time scales as a smooth dynamics in strategy
space. It is usually assumed that the flow points in the direction of the most favorable local strategy, i.e.,
the actual monomorphic strategy s(t) of the population satisfies

$$\dot s = \left. \frac{\partial u(q,s)}{\partial q} \right|_{q=s}, \tag{54}$$

where now u(q, s) is the payoff of a q-strategist in a homogeneous population of s-strategists. Adaptive
dynamics may not converge to a fixed point (limit cycles are possible), and even if it converges, the attractor
may not be an ESS (Nowak and Sigmund, 2004). Counterintuitively, Adaptive Dynamics can lead to a
fitness minimum, where a monomorphic population becomes unstable and splits into two, providing a theory
of evolutionary branching and speciation (Geritz et al., 1997; Nowak and Sigmund, 2004).



4. Evolutionary games: agent-based dynamics 



What we used in the former section for the description of the state of the population was a population-level
(macro-level, aggregate-level) description. This specified the state of the population and its dynamic
evolution in terms of a small number of strategy frequencies. Such a macroscopic description is adequate if
the social network is mean-field-like and the number of agents is very large. However, when these premises are
not fulfilled, a more detailed, lower-level analysis is required. Such a lower-level approach is usually termed
"agent-based", since on this level the basic units of the theory are the individual agents themselves. The
agent-level dynamics of the system is usually defined by strategy update rules, which describe how the agents
perceive their surrounding environment, what information they acquire, what beliefs and expectations they
form from former experience, and how this all translates into strategy updates during the game. These rules
can mimic genetically coded Darwinian selection or boundedly rational human learning, both affected by
possible mistakes. When games are played on a graph, the update rule may concern not only a strategy
change but also a reorganization of the agent's local network structure.

These update rules can also be viewed as "meta-strategies", as they represent strategies about strategies.
The distinction between strategies and meta-strategies is somewhat arbitrary, and is only justified if there is
a hierarchic relation between them. This is usually assumed, i.e., the given update rule is utilized much more
rarely in the repeated game than the basic stage-game strategies. A strategy update has low probability
during the game. Roca et al. (2006) have discussed the significant consequences of varying the ratio of
time scales between payoff refreshment and strategy update. Also, while players can use different strategies
in the stage games, we usually posit that all players use the same update rule in the population. Note that
these assumptions may not be justified in certain situations.

There is a huge variety of microscopic update rules defined and applied in the game theory literature. 
There is no general law of nature which would dictate such behavioral rules: even though we call these rules
"microscopic", they, in fact, emerge as simplified phenomenological rules of an even more fundamental layer
of mechanisms describing the operation of the human psyche. The actual choice of the update rule depends
very much on the concrete problem under consideration. 

Strategy updates in the population may be synchronized or randomly sequential in the social graph.
Some of these rules are generically stochastic, some others are deterministic, sometimes with small stochastic
components representing random mutations (experimentation). Furthermore, there exist many different ways
in which the strategy change is determined by the local environment. In many cases the strategy choice for a
given player depends on the difference between her payoff and her neighbors' payoffs. This difference may
be determined by a one-shot game between the confronting players (see, e.g., the spatial Rock-Scissors-Paper
games), by a summation of the stage game payoffs over all neighbors, or perhaps these summed payoffs are
accumulated over some time with a weighting factor decreasing with the time elapsed. Usually the rules are
myopic, i.e., optimization is based on the current state of the population without anticipating possible future
alterations. We will mostly concentrate on memoryless (Markovian) systems, where the evolutionary rule is
determined by the current payoffs. A well-known exception, which we do not discuss here, is the Minority
game, which was covered in ample detail in two recent books by Challet et al. (2004) and by Coolen (2005).

In the following we consider a system with equivalent players distributed on the sites of a lattice (or graph).
The player at site x follows one of the Q pure strategies characterized by a set of Q-component unit vectors,

$$s_x = (1, 0, \ldots, 0)^T, \ (0, 1, \ldots, 0)^T, \ \ldots, \ (0, 0, \ldots, 1)^T. \tag{55}$$

The income of player x comes from the same symmetric two-person stage game with her neighbors. In
this case her total income can be expressed as

$$U_x = \sum_{y \in \Omega_x} s_x^T \cdot A s_y, \tag{56}$$

where the summation runs over agent x's neighbors $y \in \Omega_x$ defined by the connectivity structure.
For any strategy distribution $\{s_x\}$ the total income of the system is given as

$$U = \sum_x U_x = \sum_{x,\, y \in \Omega_x} s_x^T \cdot A s_y. \tag{57}$$



Notice that for potential games ($A_{ij} = A_{ji}$) this formula is analogous to the (negative) energy of an Ising-type
model. In the simplest case ($Q = 2$), if $A_{ij} = -\delta_{ij}$ then U is equivalent to the Hamiltonian of the
ferromagnetic Ising model where spins positioned on a lattice can point upward or downward. Within the
lattice gas formalism the up and down spins are equivalent to occupied and empty sites, respectively. These
models are widely used to describe ordering processes in two-component solid solutions (Kittel, 2004). The
generalization for higher Q is straightforward. The energy of the Q-state Potts model can be reproduced by
a Q x Q unit matrix (Wu, 1982). Consequently, for certain types of dynamical rules, evolutionary potential
games become equivalent to many-particle systems, whose investigations by the tools of statistical physics
are very successful [for a textbook see e.g. Chandler (1987)].
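The individual and total incomes can be computed directly from their definitions as sums of $s_x^T A s_y$ over neighbors. The sketch below is a minimal illustration under assumptions made here: a square lattice with periodic boundaries and four nearest neighbors, and strategies stored as integer indices, so that $s_x^T A s_y$ is simply the matrix element A[s_x][s_y]. With $A_{ij} = -\delta_{ij}$ and Q = 2 the total income counts aligned neighbor pairs with a minus sign, as a ferromagnetic Ising-type energy would.

```python
def neighbors(x, y, L):
    """Four nearest neighbors on an L x L lattice with periodic boundaries."""
    return [((x + 1) % L, y), ((x - 1) % L, y),
            (x, (y + 1) % L), (x, (y - 1) % L)]

def income(state, A, x, y):
    """U_x: sum of A[s_x][s_y] over the neighbors y of site x."""
    L = len(state)
    return sum(A[state[x][y]][state[nx][ny]] for nx, ny in neighbors(x, y, L))

def total_income(state, A):
    """U: sum of U_x over all sites (every bond is counted twice)."""
    L = len(state)
    return sum(income(state, A, x, y) for x in range(L) for y in range(L))

A_ising = [[-1, 0], [0, -1]]                     # A_ij = -delta_ij, Q = 2
uniform = [[0] * 4 for _ in range(4)]            # all sites aligned
checker = [[(x + y) % 2 for y in range(4)] for x in range(4)]
print(total_income(uniform, A_ising), total_income(checker, A_ising))
```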

The exploration of the possible dynamic update rules has resulted in an extremely wide variety of models, 
which cannot be surveyed completely. In the following, we focus our attention on some of the most relevant 
and/or popular rules appearing in the literature. 



4.1. Synchronized update 



Strategy revision opportunities can arrive synchronously or asynchronously for the agents. In spatial
models the difference between synchronous and asynchronous update is enhanced, because these processes
yield fundamentally different spatio-temporal patterns and general behavior (Huberman and Glance, 1993).

In synchronous update the whole population updates simultaneously in discrete time steps, giving rise to a 
discrete-time dynamics on the macro level. This is the kind of update used in cellular automata. Synchronous 
update is applied, for instance, in biological models, when generations are clearly discernible in time, or when 
seasonal effects synchronize metabolic and reproductive processes. Synchronous update may also be relevant for
other systems in which appropriate time delays force neighboring players to modify their
strategies with respect to the same current surroundings.

Cellular automata can well represent those evolutionary games where the players are located on the sites
of a lattice. The generalization to arbitrary networks (Abramson and Kuperman, 2001; Masuda and Aihara,
2003; Duran and Mulet, 2005) is straightforward and not detailed here. At discrete time steps (t = 0, 1, 2, ...)
each player refreshes her strategy simultaneously according to a deterministic rule, depending on the state
of the neighborhood. For evolutionary games this rule is usually determined by the payoffs in Eq. (56).
For example, in the model suggested by Nowak and May (1992) each player adopts the strategy of the
neighbor (including herself) who achieved the highest income in the last round.
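A single synchronous step of such an "imitate the best" rule can be sketched as follows. This is a simplified illustration rather than the original model: it assumes four nearest neighbors on a periodic square lattice, no self-interaction in the payoff, and a weak Prisoner's Dilemma in which a cooperator earns 1 from each cooperating neighbor, a defector earns b from each cooperating neighbor, and all other encounters pay 0. Ties are resolved in favor of the player's current strategy, which is also an assumption made here.

```python
def neighbors(x, y, L):
    """Four nearest neighbors on an L x L lattice with periodic boundaries."""
    return [((x + 1) % L, y), ((x - 1) % L, y),
            (x, (y + 1) % L), (x, (y - 1) % L)]

def payoffs(state, b):
    """Income of every site; 1 = cooperate, 0 = defect (assumed payoffs)."""
    L = len(state)
    U = [[0.0] * L for _ in range(L)]
    for x in range(L):
        for y in range(L):
            for nx, ny in neighbors(x, y, L):
                if state[nx][ny] == 1:
                    U[x][y] += 1.0 if state[x][y] == 1 else b
    return U

def synchronous_step(state, b):
    """All sites simultaneously copy the best scorer among self and neighbors."""
    L = len(state)
    U = payoffs(state, b)          # computed once from the old configuration
    new = [row[:] for row in state]
    for x in range(L):
        for y in range(L):
            best, best_u = state[x][y], U[x][y]
            for nx, ny in neighbors(x, y, L):
                if U[nx][ny] > best_u:   # strict: ties keep the current choice
                    best, best_u = state[nx][ny], U[nx][ny]
            new[x][y] = best
    return new

# A single defector in a sea of cooperators, b = 1.5.
L = 5
state = [[1] * L for _ in range(L)]
state[2][2] = 0
after = synchronous_step(state, 1.5)
print(sum(map(sum, after)), "cooperators remain after one step")
```

Because all payoffs are evaluated on the old configuration before any site changes, the update is genuinely synchronous, as in a cellular automaton.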

Spatio-temporal patterns (or behaviors) occurring in these cellular automata can be classified into four
distinct classes (Wolfram, 1983, 1984, 2002). In Class 1 the evolution leads to exactly the same uniform final
pattern (frequently called a "fixed point" or an "absorbing state") from almost all initial states. In Class 2
the system can develop into many different states built up from a certain set of simple local structures that
either remain unchanged or repeat themselves after a few steps (limit cycles). The behavior becomes more
complicated in Class 3, where the time-dependent patterns exhibit random elements. Finally, in Class 4
the time-dependent patterns involve high complexity (a mixture of some order and randomness) and certain
features of these nested patterns exhibit power law behavior. The best known example belonging to this
universality class is the Game of Life invented by John Conway. This two-dimensional cellular automaton
has a lot of localized structures (called animals, which can blink, move, collide, come to life, etc.) as discussed
in detail by Gardner (1970) and Sigmund (1993). Killingback and Doebeli (1998) have shown that the game
theoretical construction suggested by Nowak and May (1992) exhibits complex dynamics with long range
correlations between states in both time and space. This is a characteristic feature of Class 4.

A cellular automaton rule defines the new strategy for every player as a function of the strategy distribution 
in her neighborhood. For the Nowak-May model the spatio-temporal behavior remains qualitatively the 
same within a given range of payoff parameters, while larger variations of these parameters can modify 
substantially the behavior and can even put it into another class. In Section 6.5, we will demonstrate 
explicitly the consecutive transitions occurring in the model. 

Cellular automaton rules may involve stochastic elements. In this case, while preserving synchronicity,
the update rule defines the probability to take one of the possible states for each site. In general, stochastic
rules destroy states belonging to Classes 2 and 4. This extension can also cause additional non-equilibrium
phase transitions (Kinzel, 1985).

Agent heterogeneity can also be included. In the stochastic cellular automaton model suggested by
Lim et al. (2002) each player has an acceptance level $\delta_x$ chosen randomly from a region $-\Delta < \delta_x < \Delta$
at her birth, and the best strategy of the neighborhood is accepted if $U_x < \delta_x + \max(U_y)$, $y \in \Omega_x$. This type
of stochasticity can largely affect the behavior.

There is a direct way to make a cellular automaton stochastic (Mukherji et al., 1996; Tomochi and Kono,
2002). If the prescription of the local deterministic rule is accepted at each site with probability $\mu$, while
the site remains in the previous state with probability $1 - \mu$, the parameter $\mu$ characterizes the degree of
synchronization. This update rule reproduces the original cellular automaton for $\mu = 1$, whereas for $\mu < 1$
it leads to stochastic update. This type of stochastic cellular automata allows us to study the continuous
transition from deterministic synchronized update to random sequential evolution (limit $\mu \to 0$).
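The construction with an acceptance probability can be wrapped around any deterministic local rule. The sketch below is a generic illustration: the one-dimensional ring and the toy majority rule are hypothetical choices made only to keep the example short, and the parameter `mu` plays the role of the degree of synchronization described above.

```python
import random

def partially_synchronous_step(state, local_rule, mu, rng=random):
    """Apply local_rule at each site with probability mu, else keep the old value.

    Every accepted update is computed from the same old configuration.
    """
    return [local_rule(state, i) if rng.random() < mu else state[i]
            for i in range(len(state))]

def majority_rule(state, i):
    """Toy deterministic rule on a ring (hypothetical, for illustration only)."""
    n = len(state)
    votes = state[i - 1] + state[i] + state[(i + 1) % n]
    return 1 if votes >= 2 else 0

state = [1, 1, 0, 0, 1, 0]
print(partially_synchronous_step(state, majority_rule, mu=1.0))  # fully synchronous
print(partially_synchronous_step(state, majority_rule, mu=0.3))  # only some sites update
```

For mu = 1 the deterministic automaton is recovered, while for small mu almost every step changes at most one site, which mimics random sequential update.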

The effects of weak noise, $1 - \mu \ll 1$, have been studied from different points of view. Mukherji et al.
(1996) and Abramson and Kuperman (2001) demonstrated that a small amount of noise can prevent the
system from falling into a "frozen" state characteristic of Class 2. The sparsely occurring "errors" are capable
of initiating avalanches, which transform the system into another meta-stable state. Under some conditions
the size distribution of these avalanches exhibits power-law behavior, at least for some parameter regions,
as reported by Lim et al. (2002) and Holme et al. (2003). The frozen patterns and the avalanches are
discussed in detail by Zimmermann and Eguíluz (2005).



4.2. Random sequential update 

The other basic update mechanism is asynchronous update. In many real social systems the players modify
their strategies independently of each other. For these systems random sequential (asynchronous) update
gives a more appropriate description. One possibility is that in each time step one agent is selected at random
from the population. Thus the probability per time of updating a given agent is $\lambda = 1/N$. Alternatively,
each agent may possess an independent "Poisson clock", which "rings" for update according to a Poisson
process at rate $\lambda$. These assumptions assure that the simultaneous update of more than one agent has zero
probability, and thus in each moment the macroscopic state of the population can only change a little. In the
infinite population limit asynchronous update leads to smooth, continuous dynamics as we will see in Sec.
4.4. Asynchronous strategy revisions are appropriate for overlapping generations in the Darwinian selection
context. They are also more typical in economics applications.
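The two asynchronous schedules can be generated in a few lines. The sketch below is illustrative only, with hypothetical function names: the first function draws update times from independent Poisson clocks by summing exponentially distributed waiting times, and the second picks one agent uniformly at random per elementary step. In both cases only the resulting order of agents would be passed on to a strategy update rule, which is not modeled here.

```python
import random

def poisson_clock_events(n_agents, rate, t_max, seed=0):
    """Sorted list of (time, agent) update events from independent Poisson clocks."""
    rng = random.Random(seed)
    events = []
    for agent in range(n_agents):
        t = rng.expovariate(rate)          # waiting time to the first ring
        while t < t_max:
            events.append((t, agent))
            t += rng.expovariate(rate)     # waiting time to the next ring
    events.sort()
    return events

def random_sequential_events(n_agents, n_steps, seed=0):
    """One uniformly chosen agent per elementary time step."""
    rng = random.Random(seed)
    return [rng.randrange(n_agents) for _ in range(n_steps)]

events = poisson_clock_events(n_agents=50, rate=1.0, t_max=10.0, seed=1)
print(len(events), "updates; first few:", events[:3])
print(random_sequential_events(n_agents=10, n_steps=5, seed=1))
```

Since the event times are continuous random variables, two clocks ring at exactly the same moment with probability zero, in line with the remark above that simultaneous updates do not occur.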

In the case of random sequential update the central object of the agent-level description is the individual
transition rate $w(s \to s')$, which denotes the conditional probability per unit time that an agent, given the
opportunity to update, flips from strategy s to s'. Clearly, this should satisfy the sum rule

$$\sum_{s'} w(s \to s') = \lambda \tag{58}$$

($s' = s$ included in the sum). In population games the individual transition rate only depends on the
macroscopic state of the population, $w(s \to s') = w(s \to s'; p)$. In network games the rate depends on the
other agents' actual state in the neighborhood, including their current strategies or payoffs, $w(s_x \to s'_x) =
w(s_x \to s'_x; \{s_y, U_y\}_{y \in \Omega_x})$. In theory the transition probabilities could also depend explicitly on time (the
round of the game), or on the complex history of the game, etc., but we usually disregard these possibilities.



4.3. Microscopic update rules 

In the following we list some of the most representative microscopic rules, based on replication, imitation,
and learning.

Mutation and experimentation 

The simplest case is when the individual transition probability is independent of the state of the other
agents. Strategy change arises due to intrinsic mechanisms with no influence from the rest of the population.
The prototypical example is mutation in biology. A similar mechanism is often called experimentation in
economics contexts. It is customary to assume that mutation probabilities are payoff-independent constants

$$w(s \to s') = c_{ss'}, \tag{59}$$

such that the sum rule in Eq. (58) is satisfied.

If mutation is the only dynamic mechanism without any other selective force, it leads to spontaneous
genetic drift (random walk) in the state space. If the population is finite there is a finite probability that
a strategy becomes extinct by pure chance (Drossel, 2001). Drift usually has less relevance in large populations
[for exceptions see, e.g., Traulsen et al. (2006)], but becomes an important issue in the case of new
mutants. Initially mutant agents are small in number, and even if the mutation is beneficial, the probability
of fixation (i.e., reaching a macroscopic number in the population) is less than one, because the mutation
may be lost by random drift.

Imitation 

Imitation processes form a wide class of microscopic update rules. The essence of imitation is that the 
agent who has the opportunity to revise her strategy takes over the strategy of one of the fellow players with 
some probability. The strategy of the fellow player remains intact; it only plays a catalyzing role. Imitation 
cannot introduce a new strategy which is not yet played in the population; this rule is non-innovative. 

Imitation processes can differ in two respects: whom to imitate and with what probability. The standard 
procedure is to choose the agent to imitate at random from the neighborhood. In the mean-field case 
this means a random partner from the whole population. The imitation probability may depend on the 
information available for the agent. The rule can be different if only the strategies used by the neighbors 
are known, or if both the strategies and their resulting last-round (or accumulated) payoffs are available
for inspection. In the first case the agent must decide knowing only her own payoff. Rules in this case are
similar in spirit to the Win-Stay-Lose-Shift rules that we discuss later on.

In the case when both strategies and payoffs can be compared, one of the simplest imitation rules is Imitate
if Better. Agent $x$ with strategy $s_x$ takes over the strategy of another agent $y$, chosen randomly from $x$'s
neighborhood $\Omega_x$, iff $y$'s strategy has yielded higher payoff; otherwise the original strategy is maintained. If
we denote the set of neighbors of agent $x$ who play strategy $s$ by $\Omega_x(s) \subset \Omega_x$, the individual transition rate
for $s'_x \neq s_x$ can be written as

$$ w(s_x \to s'_x) = \frac{\lambda}{|\Omega_x|} \sum_{y \in \Omega_x(s'_x)} \theta[U_y - U_x] \tag{60} $$

where $\theta$ is the Heaviside function, $\lambda > 0$ is an arbitrary constant, and $|\Omega_x|$ is the number of neighbors. In
the mean-field case (population game) this simplifies to

$$ w(s_x \to s'_x) = \lambda \rho_{s'_x} \theta[U(s'_x) - U(s_x)]. \tag{61} $$

Imitation rules are more realistic if they take into consideration the actual payoff differences between the
original and the imitated strategies. Along this line, an update rule with nice dynamical properties, as we
will see, is Schlag's Proportional Imitation (Helbing, 1998; Schlag, 1998, 1999). In this case another agent's
strategy in the neighborhood is imitated with a probability proportional to the payoff difference, provided
that the new payoff is higher than the old one:

$$ w(s_x \to s'_x) = \frac{\lambda}{|\Omega_x|} \sum_{y \in \Omega_x(s'_x)} \max[U_y - U_x, 0]. \tag{62} $$

Imitation only occurs if the target strategy is more successful, and in this case its rate grows linearly with
the payoff difference. For later convenience we give again the mean-field rates

$$ w(s_x \to s'_x) = \lambda \rho_{s'_x} \max[U(s'_x) - U(s_x), 0]. \tag{63} $$

Proportional imitation does not allow for an inferior strategy to replace a more successful one. Update
rules which forbid this are usually called payoff monotone. However, payoff monotonicity is frequently broken
in case of bounded rationality, and a strategy may be imitated with some finite probability even if it has
produced lower payoff in earlier rounds. A possible general form of Smoothed Imitation is

$$ w(s_x \to s'_x) = \frac{\lambda}{|\Omega_x|} \sum_{y \in \Omega_x(s'_x)} g(U_y - U_x) \tag{64} $$

where $g$ is a monotonically increasing smoothing function, for instance,

$$ g(\Delta u) = \frac{1}{1 + \exp(-\Delta u / K)} \tag{65} $$

where $K$ can measure the extent of noise.
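The three pairwise imitation rules differ only in how the payoff difference enters the sum over neighbors. The following Python sketch makes this explicit. It assumes that strategies and payoffs are stored in plain dictionaries keyed by agent; the function name, the argument layout and the default values are illustrative choices, not part of the original formulation.

```python
import math

def imitation_rate(x, target, strategy, payoff, neighbors,
                   rule="better", lam=1.0, K=0.1):
    """Rate w(s_x -> target) for agent x under Eq. (60), (62) or (64).

    strategy, payoff: dicts agent -> strategy / payoff (assumed layout).
    neighbors: dict agent -> list of neighbouring agents.
    """
    if target == strategy[x]:
        raise ValueError("target must differ from the current strategy")
    total = 0.0
    for y in neighbors[x]:
        if strategy[y] != target:
            continue  # only neighbours already playing the target strategy count
        du = payoff[y] - payoff[x]
        if rule == "better":            # Imitate if Better, Eq. (60)
            total += 1.0 if du > 0 else 0.0
        elif rule == "proportional":    # Proportional Imitation, Eq. (62)
            total += max(du, 0.0)
        elif rule == "smoothed":        # Smoothed Imitation, Eq. (64), logistic g
            total += 1.0 / (1.0 + math.exp(-du / K))
        else:
            raise ValueError(f"unknown rule: {rule}")
    return lam * total / len(neighbors[x])
```

Only the smoothed rule gives a nonzero contribution from a neighbor whose payoff is lower than that of agent x, which is how payoff monotonicity is lost.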

Imitation rules can be generalized to the case when the entire neighborhood is monitored simultaneously
and plays a collective catalyzing role. Denoting by $\hat{\Omega}_x = \{x, \Omega_x\}$ the neighborhood which includes agent $x$
as well, a possible form used by Nowak et al. (1994a,b) and their followers (e.g., Alonso-Sanz et al., 2001;
Masuda and Aihara, 2003) for the Prisoner's Dilemma is

$$ w(s_x \to s'_x) = \lambda \frac{\sum_{y \in \hat{\Omega}_x(s'_x)} g(U_y)}{\sum_{y \in \hat{\Omega}_x} g(U_y)} \tag{66} $$

where again $g(U)$ is an arbitrary positive smoothing function, and the sum in the numerator is limited to
agents in the neighborhood who pursue the target strategy $s'_x$. Nowak et al. (1994a,b) considered $g(z) = z^k$
($z > 0$). This choice reproduces the deterministic rule Imitate the Best (in the neighborhood) in the limit
$k \to \infty$. Most of their analysis, however, focused on the linear version $k = 1$. When $k < \infty$ this rule is not
payoff monotone, but assures in general that strategies which perform better on average in the neighborhood
have higher chances to be imitated.
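The role of the exponent k in the collective rule of Eq. (66) can be checked numerically. The sketch below is a minimal illustration assuming $\lambda = 1$ and strictly positive payoffs; the function name and the list-based layout of the extended neighborhood are hypothetical.

```python
def adoption_probability(target, strategies, payoffs, k=1.0):
    """Eq. (66) with g(z) = z**k: share of g-weighted payoff held by `target`.

    strategies, payoffs: parallel lists over the extended neighbourhood
    (agent x itself plus its neighbours); payoffs must be positive.
    """
    if any(u <= 0 for u in payoffs):
        raise ValueError("g(z) = z**k is used here for positive payoffs only")
    u_max = max(payoffs)
    # Dividing by the largest payoff leaves the ratio unchanged and avoids overflow.
    weights = [(u / u_max) ** k for u in payoffs]
    numerator = sum(w for s, w in zip(strategies, weights) if s == target)
    return numerator / sum(weights)
```

With strategies `["C", "D", "D", "C"]` and payoffs `[2.0, 3.0, 1.0, 2.5]`, the linear version (k = 1) favors C, which does better on average, while a large k concentrates almost all weight on D, the strategy of the single best-performing agent.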

Under any imitation rule a finite system can reach a homogeneous state sooner or later.
Once the homogeneous state is reached the system remains there forever, that is, evolution
stops. Homogeneous strategy distributions are absorbing states for imitative behavior. The time to reach
one of the absorbing states, however, increases very rapidly with the system size. With a small mutation rate
added, the homogeneous states cease to be steady states, and the dynamical process becomes ergodic.



Moran process



Although imitation seems to be a conscious decision process, the underlying mathematical structure
may emerge in dynamic processes which have nothing to do with higher level consciousness. Consider,
for instance, the Moran process, which is biology's prototype dynamic update rule for asexual (haploid)
replication (Moran, 1962; Nowak et al., 2004). At each time step, one individual is chosen for reproduction
with a probability proportional to its fitness. (In the game theory context fitness can be the payoff or a
non-negative, monotonic function of the payoff.) An identical offspring (clone) is produced, which replaces
another individual. In the mean-field case this latter is chosen randomly from the population. The population
size $N$ remains constant. Update in the population is asynchronous.

The overall result of the Moran process is that one individual of a constant population forgoes her strategy,
and instead takes over the strategy of another agent with a probability proportional to the relative abundance
of that strategy. The Moran process can be viewed as a kind of imitation. Formally, in the mean-field case,
the individual transition rates read

$$ w(s_x \to s'_x) = \lambda \rho_{s'_x} \frac{U(s'_x)}{\bar{U}} \tag{67} $$

where $\bar{U} = \sum_s \rho_s U(s)$ is the average fitness of the population. The right hand side is simply the frequency
of $s'$-strategists, weighted by their relative fitness. This is formally the rule Eq. (66) with $g(z) = z$.

The Moran process has recently been investigated by Nowak et al. (2004); Taylor et al. (2004); Wild and Taylor
(2005); Antal and Scheuring (2006); Traulsen et al. (2006b) to determine the size-dependence of the relaxation
time and the extinction probability characterizing how the system reaches a homogeneous absorbing
state via random fluctuations. An explicit mean-field description in the form of a Fokker-Planck equation was
derived and studied by Traulsen et al. (2005, 2006a).
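The mean-field Moran update is easy to simulate directly. The sketch below is a minimal illustration using only the standard library; the function names, the `fitness` callback and the choice of a constant relative fitness `r` for the mutants are assumptions made for the example, and the replaced individual is allowed to coincide with the reproducing one.

```python
import random

def moran_step(counts, fitness, rng):
    """One mean-field Moran update; counts[i] = number of i-strategists.

    fitness(i, counts) must return a non-negative number (hypothetical callback).
    """
    strategies = [i for i, n in enumerate(counts) if n > 0]
    # Reproducer: chosen with probability proportional to n_i * fitness_i.
    weights = [counts[i] * fitness(i, counts) for i in strategies]
    parent = rng.choices(strategies, weights=weights)[0]
    # Replaced individual: chosen uniformly from the whole population.
    victim = rng.choices(range(len(counts)), weights=counts)[0]
    counts[victim] -= 1
    counts[parent] += 1
    return counts

def fixation_fraction(n_mutants, N, r, runs, seed=0):
    """Fraction of runs in which mutants of constant relative fitness r take over."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(runs):
        counts = [N - n_mutants, n_mutants]      # [residents, mutants]
        while 0 < counts[1] < N:
            moran_step(counts, lambda i, c: r if i == 1 else 1.0, rng)
        fixed += counts[1] == N
    return fixed / runs
```

For constant fitness the estimate can be compared with the standard closed-form fixation probability of a single mutant, $(1 - 1/r)/(1 - 1/r^N)$, which stays below one even for beneficial mutants because of random drift.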



Better and Best Response 

Imitation dynamics, including the Moran process, considered thus far are all non-innovative dynamics,
which cannot introduce new strategies into the population. If a strategy becomes extinct, a non-innovative
rule cannot revive it. In contrast with this, dynamics which can introduce new strategies are called innovative.

One of the most important innovative dynamics is the Best Response rule. This assumes that agents,
when they get a revision opportunity, adopt their best possible strategy (best response) to the current
strategy distribution of the neighborhood. The Best Response rule is myopic: agents have no memory and
do not worry about the long run consequences of their strategy choice. Nevertheless, best response requires
more cognitive abilities than imitation: (1) the agent should be aware of the distribution of the co-players'
strategies, and (2) should fully know the conceivable strategy options. Note that these may not be realistic
assumptions in complicated real-life situations.

Denoting by $s_{-x}$ the current strategy profile of the population in the neighborhood of agent $x$, and by
$\mathrm{BR}(s_{-x})$ the set of the agent's best replies to $s_{-x}$, the Best Response rule can be formally written as

$$ w(s_x \to s'_x) = \begin{cases} \lambda / |\mathrm{BR}| & \text{if } s'_x \in \mathrm{BR}(s_{-x}), \\ 0 & \text{if } s'_x \notin \mathrm{BR}(s_{-x}) \end{cases} \tag{68} $$

where $|\mathrm{BR}|$ is the number of possible best replies (there may be more than one).

The player may not be able to assess the distribution of strategies in the population (neighborhood), but
may remember the ones confronted in earlier rounds. Such a setup leads to Fictitious Play, a prototype
dynamic rule studied already in the 1950s (Brown, 1951), in which the best response is given to the overall empirical
distribution of former rounds.

Bounded rationality may also imply that the agent is only able to consider a limited set of opportunities
and optimize within this set (Better Response). For instance, in the case of continuous strategy spaces,
taking a small step in the direction of the local payoff gradient is called Gradient Dynamics. The new
strategy adopted is

$$ s'_x = s_x + \frac{\partial U(s_x, s_{-x})}{\partial s_x} \Delta t \tag{69} $$

which improves the player's payoff by a small amount.

In many cases it is reasonable to assume that the strategy update is a stochastic process, and instead of a
sharp Best Response a smoothed rule is posited. In Smoothed Best Response the transition rates are usually
written as

$$ w(s_x \to s'_x) = \lambda \frac{g[U(s'_x; s_{-x})]}{\sum_{s''_x \in S_x} g[U(s''_x; s_{-x})]} \tag{70} $$

where $g$ is a monotonically increasing positive smoothing function assuring that better strategies have more
chance to be adopted. A typical choice, as will be discussed in the next Section, is when $g$ takes the Boltzmann
form

$$ w(s_x \to s'_x) = \lambda \frac{\exp[U(s'_x; s_{-x})/K]}{\sum_{s''_x \in S_x} \exp[U(s''_x; s_{-x})/K]}. \tag{71} $$

This rule is usually called the "Logit rule" or the "Log-linear rule" in the game theory literature (Blume,
1993).



Smoothed Best Response is not payoff monotone, and is very similar in form to smoothed imitation in Eq.
(66). The major difference is the extent of the strategy space: imitation only considers strategies actually
played in the social neighborhood, whereas the variants of the best response rules consider all of the agent's
feasible strategies, even if they are not played in the population.
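The relation between the sharp Best Response rule of Eq. (68) and the Logit rule of Eq. (71) can be seen in a few lines of code. This is a minimal sketch that assumes the payoffs of all feasible strategies against the fixed strategies of the others are already known and passed in as a dictionary; the function names are illustrative.

```python
import math

def logit_rates(payoffs, K, lam=1.0):
    """Transition rates of Eq. (71): lam * exp(U/K) / sum over all feasible strategies.

    payoffs: dict strategy -> U(strategy; s_-x), the payoff the agent would get
    against the current, fixed strategies of the others (assumed layout).
    """
    u_max = max(payoffs.values())
    # Subtracting the maximum does not change the ratios and prevents overflow.
    weights = {s: math.exp((u - u_max) / K) for s, u in payoffs.items()}
    norm = sum(weights.values())
    return {s: lam * w / norm for s, w in weights.items()}

def best_response_rates(payoffs, lam=1.0):
    """Sharp rule of Eq. (68): rate lam/|BR| on best replies, zero elsewhere."""
    u_max = max(payoffs.values())
    best = [s for s, u in payoffs.items() if u == u_max]
    return {s: (lam / len(best) if s in best else 0.0) for s in payoffs}
```

For small K the Logit rates approach the Best Response rates, including the equal split among tied best replies, while for large K every feasible strategy is chosen with nearly equal rate.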

Smoothed Best Response can describe a myopic player who is capable of estimating the variation of her
own payoff upon a strategy change while assuming that the neighbors' strategies remain unchanged. Such a
player always adopts (rejects) the advantageous (disadvantageous) strategy in the limit $K \to 0$, but makes
a boundedly rational (noisy) decision with some probability of error when $K > 0$. The rule in Eq.
(71) may also model a situation in which players are rational but payoffs have a small hidden idiosyncratic
random component or when gathering perfect information about decision alternatives is costly (Blume, 2003;
Blume and Durlauf, 2003; Helbing, 1996). All these features together can be handled mathematically by the
"temperature" parameter $K$.

Such dynamical rules were used by Ebel and Bornholdt (2002) in the consideration of the Prisoner's
Dilemma, by Sysi-Aho et al. (2005) who studied an evolutionary Snow-drift game, and by Blume (1993,
1998, 2003); Blume and Durlauf (2003) in Coordination games.



Win-Stay-Lose-Shift

When the co-players' payoffs are unobservable, the player must decide about a new strategy knowing
only her own payoffs realized earlier in the game. She may have an "aspiration level": when her last
round payoff (or payoffs averaged over a number of rounds) is below this aspiration level, she shifts over to
a new strategy, otherwise she stays with the original one. The new (target) strategy can be imitated from
the neighborhood or chosen randomly from the agent's own strategy set.

This is the philosophy behind the Win-Stay-Lose-Shift rules. These rules have been frequently discussed
in connection with the Prisoner's Dilemma, where there are four possible payoff values $S < P < R < T$ in the
stage game (see Appendix A.7). If the aspiration level is set between the second and third payoff values, and
only the last round payoff is considered (no averaging), we obtain the so-called "Pavlov rule" [also called
"Pavlov strategy", when considered as an elementary (non-meta) strategy]. The individual transition rate
can be written as

$$ w(s_x \to s'_x) = \lambda \theta(a - U_x), \qquad P < a < R, \tag{72} $$

where $a$ is the aspiration level, $s_x = C, D$ and $s'_x = D, C$ (the opposite of $s_x$), respectively. Pavlov is known
to beat Tit-for-Tat in a noisy environment (Nowak and Sigmund, 1993). In the spatial versions the Pavlov
rule was used by Fort and Viola (2005), who observed that the size distribution of C (or D) strategy clusters
follows power-law scaling.
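The Pavlov rule is simple enough to write as a lookup. The sketch below is a deterministic illustration of Eq. (72), i.e., it describes what the player does whenever she gets a revision opportunity. The payoff values S = 0, P = 1, R = 3, T = 5 and the aspiration level a = 2 are illustrative assumptions that merely satisfy S < P < a < R < T.

```python
def pavlov_update(my_move, opponent_move, aspiration=2.0, payoffs=None):
    """Win-Stay-Lose-Shift with a fixed aspiration level, cf. Eq. (72).

    Moves are "C" or "D". The default payoffs S=0 < P=1 < R=3 < T=5 and the
    aspiration level a=2 (so that P < a < R) are illustrative assumptions.
    """
    if payoffs is None:
        payoffs = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
    realized = payoffs[(my_move, opponent_move)]
    if realized >= aspiration:
        return my_move                      # win: stay
    return "D" if my_move == "C" else "C"   # lose: shift to the opposite move
```

With these numbers the player stays after receiving R or T and shifts after receiving S or P, so mutual defection is followed by an attempt to cooperate.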

Other aspiration levels, with or without averaging, define other variants of the Win-Stay-Lose-Shift class
of (meta-) strategies. The set of these rules can be extended by allowing a dynamic change of the aspiration
level, as it appears frequently in human and animal examples (Colman, 1995). These strategies specify
how the aspiration level changes step by step given the previous payoffs. For example, the so-called
Yesterday strategy repeats the previous action if, and only if, the payoff is at least as good as in the previous
round (Posch et al., 1999). Other versions of these strategies are discussed in (Posch, 1999).



4.4. From micro to macro dynamics in population games 

The individual transition rates introduced in the former section can be used to formulate the dynamics in
terms of a master equation. This is especially convenient in the mean-field case, i.e., for population games.
For random sequential update in each infinitesimal time interval there is at most one agent who considers
a strategy change. When the change is accepted, the old strategy $i$ is replaced by a new strategy $j$, thereby
decreasing the number of players pursuing $i$ by one, and increasing the number of players pursuing $j$ by one
in the population. These processes are called $Q$-type birth-death Markov processes (Blume, 1998; Gardiner,
2004), where $Q$ is the number of different strategies in the game.
Let $n_i$ denote the number of $i$-strategists in the population, and

$$ \mathbf{n} = \{n_1, n_2, \ldots, n_Q\}, \qquad \sum_i n_i = N, \tag{73} $$

the macroscopic configuration of the system. We introduce $\mathbf{n}^{(j,k)}$ as a shorthand for the configuration
(Helbing, 1996, 1998)

$$ \mathbf{n}^{(j,k)} = \{n_1, n_2, \ldots, n_j - 1, \ldots, n_k + 1, \ldots, n_Q\}, \tag{74} $$

which differs from $\mathbf{n}$ by the elementary process of changing one $j$-strategist into a $k$-strategist. The
time-dependent probability density over configurations, $P(\mathbf{n}, t)$, satisfies the master equation

$$ \frac{dP(\mathbf{n}, t)}{dt} = \sum_{\mathbf{n}'} \left\{ P(\mathbf{n}', t) W(\mathbf{n}' \to \mathbf{n}) - P(\mathbf{n}, t) W(\mathbf{n} \to \mathbf{n}') \right\} \tag{75} $$

where $W(\mathbf{n} \to \mathbf{n}')$ is the configurational transition rate. The first (second) term represents the inflow
(outflow) into (from) configuration $\mathbf{n}$. This form assumes that the update rule is a Markov process with no
memory. [For a generalized master equation with a memory effect see, e.g., Helbing (1998).]



The configurational transition rates can be calculated from the individual transition rates. Notice that
$W(\mathbf{n} \to \mathbf{n}')$ is only nonzero if $\mathbf{n}' = \mathbf{n}^{(j,k)}$ with some $j$ and $k$, and in this case it is proportional to the number
of $j$-strategists $n_j$ in the population. Thus we can write the configurational rates as

$$ W(\mathbf{n} \to \mathbf{n}') = \sum_{j,k} n_j w(j \to k; \mathbf{n}) \, \delta_{\mathbf{n}', \mathbf{n}^{(j,k)}}. \tag{76} $$

The master equation in Eq. (75) describes the dynamics of the configurational probabilities. Knowing the
actual state of the population at a given time means that the probability distribution $P(\mathbf{n}, t)$ is extremely
sharp (delta function). This initial probability density becomes wider and wider as time elapses. When
the population is large, this widening is slow enough that a deterministic approximation can give
a satisfactory description. The temporal trajectory of the mean strategy densities, i.e., the state vector
$\rho \in \Delta_Q$, obeys a deterministic, continuous-time, first-order ordinary differential equation (Helbing, 1996,
1998; Benaim and Weibull, 2003).



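In order to show this, we define the (time-dependent) average value of a quantity $f$ as $\langle f \rangle = \sum_{\mathbf{n}} f(\mathbf{n}) P(\mathbf{n}, t)$.
Following Helbing (1998), we express the time derivative of $\langle n_i \rangle$ from the master equation as

$$ \frac{d\langle n_i \rangle}{dt} = \sum_{\mathbf{n}, \mathbf{n}'} (n'_i - n_i) W(\mathbf{n} \to \mathbf{n}') P(\mathbf{n}, t). \tag{77} $$

Using Eq. (76) we get

$$ \frac{d\langle n_i \rangle}{dt} = \sum_{\mathbf{n}, j, k} n_j w(j \to k; \mathbf{n}) \left( n_i^{(j,k)} - n_i \right) P(\mathbf{n}, t) = \sum_{\mathbf{n}, j} \left[ n_j w(j \to i; \mathbf{n}) - n_i w(i \to j; \mathbf{n}) \right] P(\mathbf{n}, t) \tag{78} $$

where we have used that $n_i^{(j,k)} - n_i = \delta_{ik} - \delta_{ij}$.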

Equation (78) is exact. However, to get a closed equation which only contains mean values an approximation
should be made. When the probability distribution is narrow enough, we can write

$$ \langle n_i w(i \to j; \mathbf{n}) \rangle \approx \langle n_i \rangle w(i \to j; \langle \mathbf{n} \rangle). \tag{79} $$

This approximation leads to the approximate mean value equation for the strategy frequencies $\rho_i(t) = \langle n_i \rangle / N$
(Weidlich, 1991; Helbing, 1998, 1996; Benaim and Weibull, 2003):

$$ \frac{d\rho_i(t)}{dt} = \sum_j \left[ \rho_j(t) w(j \to i; \rho) - \rho_i(t) w(i \to j; \rho) \right]. \tag{80} $$

This equation can be used to derive the macroscopic dynamic equations from the microscopic update
rules. Take, for instance, Proportional Imitation in Eq. (63). Equation (80) then gives (setting $\lambda = 1$ by an
appropriate choice of the time unit)

$$ \frac{d\rho_i}{dt} = \sum_{j \in S_-} \rho_i \rho_j (u_i - u_j) - \sum_{j \in S_+} \rho_i \rho_j (u_j - u_i) \tag{81} $$

where $S_+$ ($S_-$) denotes the set of strategies superior (inferior) to strategy $i$. It can be readily checked that
this leads to

$$ \frac{d\rho_i}{dt} = \rho_i (u_i - \bar{u}), \qquad \bar{u} = \sum_j \rho_j u_j \tag{82} $$

where $\bar{u}$ is the average payoff in the population. This is exactly the replicator dynamics in the Taylor-Jonker
form, Eq. (41).
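The step from Eq. (81) to Eq. (82) is short but worth spelling out. The following worked derivation is a sketch that assumes $\lambda = 1$ and writes $\rho_i$ for the strategy frequencies and $u_i$ for the mean-field payoffs.

```latex
% The outflow term of Eq. (81) has the same form as the inflow term,
% since -\rho_i\rho_j(u_j - u_i) = \rho_i\rho_j(u_i - u_j), and strategies
% with u_j = u_i contribute nothing. The two restricted sums therefore
% merge into a single sum over all strategies j:
\begin{align}
\frac{d\rho_i}{dt}
  &= \sum_{j} \rho_i \rho_j (u_i - u_j) \\
  &= \rho_i u_i \sum_j \rho_j - \rho_i \sum_j \rho_j u_j \\
  &= \rho_i (u_i - \bar{u}),
  \qquad \text{using } \sum_j \rho_j = 1, \quad \bar{u} = \sum_j \rho_j u_j .
\end{align}
```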

If the microscopic rule is chosen to be the Moran process, Eq. (67), the approximate mean value equation 
leads instead to the Maynard-Smith form of the replicator equation, Eq. (42). Similarly, the Best Response 
rule in Eq. (68) or the Logit rule in Eq. (71) lead to the corresponding macroscopic counterparts in Eqs. 
(50) and (53). 
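The correspondence between Proportional Imitation and the replicator equation can also be verified numerically. The sketch below evaluates the right-hand side of Eq. (80) for the rates of Eq. (63) and compares it with the Taylor-Jonker form. The helper names and the assumption of linear mean-field payoffs $u_i = \sum_j A_{ij} \rho_j$ for a payoff matrix A are choices made for the illustration.

```python
def mean_field_rhs(rho, rate):
    """Right-hand side of Eq. (80): sum_j [rho_j w(j->i) - rho_i w(i->j)]."""
    Q = len(rho)
    return [sum(rho[j] * rate(j, i, rho) - rho[i] * rate(i, j, rho)
                for j in range(Q) if j != i)
            for i in range(Q)]

def payoffs(A, rho):
    """Mean-field payoffs u_i = sum_j A[i][j] * rho_j for payoff matrix A."""
    return [sum(a * r for a, r in zip(row, rho)) for row in A]

def proportional_imitation(A, lam=1.0):
    """Rate of Eq. (63): w(i->j) = lam * rho_j * max(u_j - u_i, 0)."""
    def rate(i, j, rho):
        u = payoffs(A, rho)
        return lam * rho[j] * max(u[j] - u[i], 0.0)
    return rate

def replicator_rhs(A, rho):
    """Taylor-Jonker form: rho_i * (u_i - average payoff)."""
    u = payoffs(A, rho)
    u_bar = sum(r * ui for r, ui in zip(rho, u))
    return [r * (ui - u_bar) for r, ui in zip(rho, u)]
```

Because `mean_field_rhs` takes the rate as a function argument, other microscopic rules can be plugged in the same way to inspect their macroscopic limits.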

Since Eq. (80) is only approximate, the actual behavior in a large but finite population will eventually
deviate from the solution of Eq. (80). Going beyond the approximate mean value equation requires a Taylor
expansion of the right hand side of Eq. (78), and leads to an infinite hierarchy of coupled equations involving
higher moments. This hierarchy should be truncated by an appropriate decoupling scheme. The simplest
(second order) approximation beyond Eq. (80) yields two coupled equations: one for the mean and one for
the covariance $\langle n_i n_j \rangle_t$ (Weidlich, 1991; Helbing, 1998, 1996) [for an alternative approach see Binmore et al.
(1995)].



4.5. Potential games and the kinetic Ising model 



The kinetic Ising model, introduced by Glauber (1963), exemplifies how a static spin model can be extended
by a dynamic rule such that the dynamics drives the system towards the thermodynamic equilibrium
characterized by Boltzmann probabilities.

In Glauber dynamics the following steps are repeated: (1) choose a site (player) $x$ at random; (2) choose
a possible spin state (strategy) $s'_x$ at random; (3) change the configuration $s = \{s_x, s_{-x}\}$ to $s' = \{s'_x, s_{-x}\}$
with a probability

$$ W(s \to s') = \frac{1}{1 + \exp(\Delta E(s, s')/K)}, \tag{83} $$

where $\Delta E(s, s') = E(s') - E(s)$ is the change in energy of the system between the initial and final spin
configurations. The parameter $K$ is the "temperature" which measures the extent of noise. Traulsen et al.
(2007a,b); Perc (2006) have shown that the stochastic evaluation of payoffs can be interpreted as an increased
temperature. Notice that $w(s_x \to s'_x) \to 1$ (0) if $\Delta E \ll -K$ ($\Delta E \gg K$), as illustrated in Fig. 11. In the
$K \to 0$ limit the dynamics becomes deterministic.




Fig. 11. Transition probability (83) as a function of $\Delta E$ provides a smooth transition between 0 and 1 within a region comparable
to $K$.
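The three Glauber steps listed above can be sketched for the simplest possible spin model. The example below assumes a ring of $\pm 1$ spins with nearest-neighbor coupling J, which is a choice made for the illustration only, and uses the sign convention in which energy-lowering moves are favored, as required by the detailed balance condition Eq. (86). All function names are hypothetical.

```python
import math
import random

def glauber_probability(delta_E, K):
    """Acceptance probability 1 / (1 + exp(delta_E / K)), delta_E = E(s') - E(s)."""
    x = delta_E / K
    if x > 700:          # exp would overflow; the probability is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(x))

def energy(spins, J=1.0):
    """Energy of a ring of +/-1 spins with nearest-neighbour coupling J (assumed model)."""
    n = len(spins)
    return -J * sum(spins[i] * spins[(i + 1) % n] for i in range(n))

def glauber_sweep(spins, K, rng, J=1.0):
    """Apply len(spins) single-site Glauber updates in place."""
    n = len(spins)
    for _ in range(n):
        x = rng.randrange(n)                       # (1) choose a site at random
        new = rng.choice((-1, 1))                  # (2) choose a spin state at random
        if new == spins[x]:
            continue
        # Energy change of flipping spin x, using only its two neighbours.
        delta_E = 2.0 * J * spins[x] * (spins[x - 1] + spins[(x + 1) % n])
        if rng.random() < glauber_probability(delta_E, K):   # (3) accept the change
            spins[x] = new
    return spins
```

At low temperature the sweeps almost never raise the energy, so an initially random ring coarsens into large aligned domains, while at high temperature flips are accepted with probability close to one half regardless of the energy change.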

Glauber dynamics allows one-site spin flips with a probability depending on the energy difference $\Delta E$
between the final and initial states. The equilibrium distribution $P(s)$ of this stochastic process is known to
satisfy the condition of detailed balance (Glauber, 1963; Kawasaki, 1972)

$$ W(s \to s') P(s) = W(s' \to s) P(s'). \tag{84} $$

In equilibrium there is no net probability current flowing between any two microscopic states connected by
the dynamics. If the detailed balance is satisfied, the equilibrium can be calculated trivially: independently
of the initial state, the dynamics drives the system towards a limit distribution in which the probability of
a given spin configuration $s$ is given by the Boltzmann distribution at temperature $K$,

$$ P(s) = \frac{\exp(-E(s)/K)}{\sum_{s'} \exp(-E(s')/K)}. \tag{85} $$

The equilibrium distribution in Eq. (85) can be reached by other transition rates as well. The only
requirement following from the detailed balance condition Eq. (84) is that the ratio of forward and backward
transitions should be

$$ \frac{W(s \to s')}{W(s' \to s)} = \exp(-\Delta E(s, s')/K). \tag{86} $$

Equation (83) corresponds to one of the standard choices with single spin flips, where the sum of the forward
and backward transition probabilities equals 1, but this choice is largely arbitrary. Alternative dynamics
may even allow two-spin exchanges (Kawasaki, 1972).
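That the ratio condition of Eq. (86) does not single out one set of rates can be illustrated by comparing two acceptance functions. The Glauber form is the one discussed in the text; the Metropolis form is brought in here purely as a well-known alternative for comparison and is not part of the original discussion. The function names are illustrative.

```python
import math

def glauber(delta_E, K):
    """Glauber (heat-bath) form: 1 / (1 + exp(delta_E / K))."""
    return 1.0 / (1.0 + math.exp(delta_E / K))

def metropolis(delta_E, K):
    """Metropolis form: accept downhill moves always, uphill with exp(-delta_E / K)."""
    return 1.0 if delta_E <= 0 else math.exp(-delta_E / K)

def balance_ratio(rate, delta_E, K):
    """Forward over backward rate; detailed balance requires exp(-delta_E / K)."""
    return rate(delta_E, K) / rate(-delta_E, K)
```

Both functions give the same forward-to-backward ratio for every energy difference, so both lead to the Boltzmann distribution, but only the Glauber form has forward and backward probabilities that sum to one.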

We remind the readers that Glauber dynamics favors elementary processes reducing the total energy in the
model, while the contact with the heat reservoir provides an average energy depending on the temperature
$K$. In other words, this dynamics drives the system into a (maximally disordered) macroscopic state where
the entropy

$$ S = -\sum_s P(s) \ln P(s) \tag{87} $$

reaches its maximum, if the average energy is fixed. The mathematical details of this approach are well
described in many books [see e.g., Chandler (1987); Haken (1988)].

Glauber dynamics, as introduced above, defines a simple stochastic rule whose limit distribution (equilibrium)
turns out to be the Boltzmann distribution. In the following we will show that the Logit rule, as
defined in Eq. (71), is equivalent to Glauber dynamics when the game has a potential (Blume, 1993, 1995,
1998). In order to prove this we have to show two things: the ratio of forward and backward transitions
satisfies Eq. (86) with a suitable energy function $E$, and the detailed balance condition Eq. (84) is indeed
satisfied.

The first is easy: for the Logit rule, Eq. (71), the ratio of forward and backward transitions reads

$$ \frac{W(s \to s')}{W(s' \to s)} = \frac{w(s_x \to s'_x)}{w(s'_x \to s_x)} = \exp(\Delta U_x / K). \tag{88} $$

If the game has a potential $V$ then Eq. (8) assures that

$$ \Delta U_x = U_x(s'_x, s_{-x}) - U_x(s_x, s_{-x}) = V(s'_x, s_{-x}) - V(s_x, s_{-x}). \tag{89} $$

Thus if we define the "energy function" as $E(\{s_x\}) = -V(\{s_x\})$ then Eq. (86) is formally satisfied.

As for the detailed balance requirement, what we have to check is Kolmogorov's Criterion (Freidlin and Wentzell,
1984; Blume, 1998; Kelly, 1979): detailed balance is satisfied if and only if the product of forward transition
probabilities equals the product of backward transition probabilities along each possible closed loop
$\mathcal{L}: s^{(1)} \to s^{(2)} \to \cdots \to s^{(L)} \to s^{(1)}$ in the state space, i.e.

$$ R_{\mathcal{L}} = \frac{W(s^{(1)} \to s^{(2)})}{W(s^{(2)} \to s^{(1)})} \, \frac{W(s^{(2)} \to s^{(3)})}{W(s^{(3)} \to s^{(2)})} \cdots \frac{W(s^{(L)} \to s^{(1)})}{W(s^{(1)} \to s^{(L)})} = 1. \tag{90} $$



Fortunately, it suffices to check the criterion for one-agent three-cycles and two-agent four-cycles: each closed 
loop can be built up from these elementary building blocks. 

For one-agent three-cycles $\mathcal{L}_3: s_x \to s'_x \to s''_x \to s_x$ (with no change in the other agents' strategies) the
Logit rule satisfies trivially the criterion:

$$ R_{\mathcal{L}_3} = \exp\left[\frac{U_x(s'_x, s_{-x}) - U_x(s_x, s_{-x})}{K}\right] \exp\left[\frac{U_x(s''_x, s_{-x}) - U_x(s'_x, s_{-x})}{K}\right] \exp\left[\frac{U_x(s_x, s_{-x}) - U_x(s''_x, s_{-x})}{K}\right] = 1, \tag{91} $$



even if the game has no potential. However, the existence of a potential is essential for the elementary
four-cycle, $\mathcal{L}_4: (s_x, s_y) \to (s'_x, s_y) \to (s'_x, s'_y) \to (s_x, s'_y) \to (s_x, s_y)$ (with no change in the other agents'
strategies). In this case

$$ R_{\mathcal{L}_4} = \exp\left[\frac{U_x(s'_x, s_y) - U_x(s_x, s_y) + U_y(s'_x, s'_y) - U_y(s'_x, s_y)}{K}\right] \exp\left[\frac{U_x(s_x, s'_y) - U_x(s'_x, s'_y) + U_y(s_x, s_y) - U_y(s_x, s'_y)}{K}\right] \tag{92} $$

where the notation was simplified by omitting the reference to other, fixed-strategy players, $U_x(s_x, s_y) \equiv
U_x(s_x, s_y; s_{-\{x,y\}})$. If the game has a potential, and thus Eq. (89) is satisfied, this can be written as

$$ R_{\mathcal{L}_4} = \exp\left[\frac{V(s'_x, s_y) - V(s_x, s_y) + V(s'_x, s'_y) - V(s'_x, s_y)}{K}\right] \exp\left[\frac{V(s_x, s'_y) - V(s'_x, s'_y) + V(s_x, s_y) - V(s_x, s'_y)}{K}\right] = 1. \tag{93} $$



Kolmogorov's criterion is satisfied since all terms in the exponent cancel. Thus we have shown that for
potential games the Logit rule is equivalent to Glauber dynamics, consequently the limit distribution is the
Boltzmann distribution at temperature $K$ with energy function $-V$,

$$ P(s) = \frac{\exp(V(s)/K)}{\sum_{s'} \exp(V(s')/K)}. \tag{94} $$
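The Boltzmann limit distribution can be checked on the smallest nontrivial potential game. The sketch below takes two players and the symmetric Coordination game with the payoff values used for the Stag Hunt example in Eq. (95), builds the Markov chain in which a randomly chosen player revises by the Logit rule, and compares its stationary distribution with Eq. (94). The discrete-time setup, the function names and the specific potential V are assumptions of this illustration.

```python
import math
from itertools import product

A = [[4.0, 3.0], [0.0, 5.0]]   # payoff matrix of Eq. (95): index 0 = Hare, 1 = Stag

def U(player, s):
    """Payoff of `player` (0 or 1) in profile s = (s_0, s_1) of the symmetric game."""
    return A[s[player]][s[1 - player]]

def V(s):
    """A potential of the game: a - c on (Hare, Hare), d - b on (Stag, Stag), else 0."""
    if s[0] != s[1]:
        return 0.0
    return A[0][0] - A[1][0] if s[0] == 0 else A[1][1] - A[0][1]

def logit_chain(K):
    """Transition matrix: pick a player uniformly, then apply the Logit rule."""
    states = list(product((0, 1), repeat=2))
    T = {s: {t: 0.0 for t in states} for s in states}
    for s in states:
        for player in (0, 1):
            options = []
            for new in (0, 1):
                t = list(s)
                t[player] = new
                options.append(tuple(t))
            weights = [math.exp(U(player, t) / K) for t in options]
            for t, w in zip(options, weights):
                T[s][t] += 0.5 * w / sum(weights)
    return states, T

def stationary(states, T, iterations=5000):
    """Stationary distribution by repeated application of the transition matrix."""
    p = {s: 1.0 / len(states) for s in states}
    for _ in range(iterations):
        p = {t: sum(p[s] * T[s][t] for s in states) for t in states}
    return p

def boltzmann(states, K):
    """Eq. (94): probabilities proportional to exp(V(s)/K)."""
    w = {s: math.exp(V(s) / K) for s in states}
    Z = sum(w.values())
    return {s: w[s] / Z for s in states}
```

Calling `stationary(*logit_chain(K))` and `boltzmann(states, K)` for the same K should give matching probabilities, with the larger weight on the profile where the potential is highest.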



The above equivalence with kinetic Ising-type models (Glauber dynamics) opens a direct way for the
application of statistical physics methods in this class of evolutionary games. For example, the mutual
reinforcement effect between neighboring (connected) agents can be considered as an attractive interaction
in case of Coordination games. A typical example is when agents should select between two competing
technologies like LINUX and WINDOWS (Lee and Valentinyi, 2000) or VHS and BETAMAX (Helbing,
1996). When the connectivity structure is characterized by a square lattice and the individual strategy
update is defined by the Logit rule (Glauber dynamics) then the stationary states of the corresponding
model are well described by the thermodynamic behavior of the Ising model, which exhibits a critical phase
transition in the absence of an external magnetic field (Stanley, 1971). Moreover, the ordering processes
(including the decay of metastable states, nucleation, and domain growth) are also well investigated for
Ising-type models (Binder and Müller-Krumbhaar, 1974; Bray, 1994). The behavior of potential games is
similar to that of the equivalent Ising-type model.

The above analogy can also be useful in games which do not strictly have a potential, but which are close
to a potential game that is equivalent to a well-known physical system. For example, a small parameter $\varepsilon$ can
characterize the symmetry-breaking of the payoff matrix ($A_{ij} - A_{ji} = \pm 2\varepsilon$ or 0), as happens when considering
the combination of a three-state Potts model with the Rock-Scissors-Paper game (Szolnoki and Szabó,
2005). One can study how the cyclic dominance ($\varepsilon > 0$) can prevent the formation of long-range order
described by the three-state Potts model ($\varepsilon = 0$) below the critical temperature. A similar idea was used
for the so-called "driven lattice gases" suggested by Katz et al. (1983, 1984), who studied the effect of an
external electric field (inducing particle transport through the system) on the ordering process controlled by
nearest-neighbor interactions in Ising-type lattice gas models [for a review see Schmittmann and Zia (1995);
Marro and Dickman (1999)].



4.6. Stochastic stability 

Although stochastic update rules are essential at the level of individual decision making, the fluctuations
average out and produce smooth, deterministic population dynamics on the aggregate level, when the
population is infinitely large and all pairs are connected. In the case of finite populations, however, the
deterministic approximation discussed in Section 4.4 is no longer exact, and usually it only provides acceptable
prediction for the short run behavior (Benaim and Weibull, 2003; Reichenbach et al., 2006a). The long-run
analysis requires the calculation of the full stochastic (limit) distribution. Even if the basic micro-dynamics
is deterministic, a small stochastic noise, originating from mutations, random errors in strategy implementation,
or deliberate experimentation with new strategies may play a crucial role. It turns out that predictions
based on the asymptotic behavior of the deterministic approximation may not always be relevant for the long
run behavior of noisy systems. The stability of dynamic equilibria with respect to pertinent perturbations
becomes a fundamental question. The concept of noise tolerance can be introduced as a novel refinement
criterion to provide an additional guideline for realistic equilibrium selection.

Fig. 12. The Stag Hunt game with uniform noise. The $N$-state birth-death Markov process can be approximated by a two-state
process in the small noise limit.



The analysis of stochastic perturbations goes back to Foster and Young (1990) who define the notion
of stochastic stability. A stochastically stable set is a set of strategy profiles, which appear with nonzero
probability in the limit distribution of the stochastic evolutionary process, when the noise level becomes
infinitesimally small. As such, stochastically stable sets may contain states which are not Nash equilibria
such as limit cycles. Nevertheless, for many important games this set contains a single Nash equilibrium.
For instance, in the case of 2x2 Coordination games, where the game has two asymptotically stable Nash
equilibria, only one of these, the risk-dominant equilibrium, proves to be stochastically stable. The connection
between stochastic stability and risk dominance was found to hold under rather wide assumptions on the
dynamic rule (Kandori et al., 1993; Young, 1993; Blume, 1998, 2003). Stochastically stable equilibria are
sometimes called "long-run equilibria" in the literature.



Following Kandori et al. (1993), we illustrate the concept of stochastic stability by a simple, symmetric,
two-strategy Coordination game

$$ \begin{array}{c|cc} & \text{Hare} & \text{Stag} \\ \hline \text{Hare} & a = 4 & b = 3 \\ \text{Stag} & c = 0 & d = 5 \end{array} \tag{95} $$



played as a population game by a finite number $N$ of players. We can think about this game as the Stag Hunt
game, where a population of hunters learn to coordinate on hunting for Stags or for Hares. (See Appendix
A.12 for a detailed description.) Coordinating on Stag is Pareto optimal, but choosing Hare is less risky if
the hunter is unsure of the co-player's choice. As for a definite micro-dynamics, we assume a deterministic
Best Response rule with asynchronous update, perturbed by a small probability $\varepsilon$ that a player, when it is
her turn to act, makes an erroneous choice (noise).

Let $N_S = N_S(t)$ denote the number of Stag hunters in the population. Without mutations the game
has two Nash equilibria: (1) everybody hunts for Stag ("AllStag", i.e., $N_S = N$), and (2) everybody hunts
for Hare ("AllHare", i.e., $N_S = 0$). Both are evolutionarily stable. There is also a third, mixed strategy
equilibrium with $N_S^* = 2N/3$, which is not evolutionarily stable. For the deterministic dynamics the first
two NEs are stable fixed points, the third is an unstable fixed point playing the role of a separatrix. When
$N_S(t) < N_S^*$ ($N_S(t) > N_S^*$) each agent's best response is to hunt for Hare (Stag). Depending on the initial
condition one of the fixed points is readily reached by the deterministic rule.

When the noise rate is finite, $\varepsilon > 0$, we should calculate a probability distribution over the state space,
which satisfies the appropriate master equation. The process is a one-type, $N$-state birth-death Markov
process, which satisfies detailed balance, and has a unique stationary distribution $P_\varepsilon(N_S)$. It is intuitively
clear that in the $\varepsilon \to 0$ limit the probability will cluster on the two deterministic fixed points, and thus the
most important features of the solution can be obtained without detailed calculations. When $\varepsilon$ is small we
can ignore all intermediate states and estimate an effective transition rate between the two stable fixed points
(Kandori et al., 1993). Reaching the border of the basin of attraction from AllHare requires $N_S^*$ subsequent
mutations, whose probability is $O(\varepsilon^{N_S^*})$ (see Fig. 12). From here the other fixed point can be reached
at a rate $O(1)$. Thus the effective transition rate from AllHare to AllStag is $\lambda = O(\varepsilon^{N_S^*}) = O(\varepsilon^{2N/3})$.
A similar calculation gives the effective rate for the inverse process $\lambda' = O(\varepsilon^{N - N_S^*}) = O(\varepsilon^{N/3})$. Using
this approximation, the stationary distribution over the two states AllHare and AllStag (neglecting all the
intermediate states) becomes

$$ P(\text{AllHare}) = \frac{\lambda'}{\lambda + \lambda'} \to 1, \qquad P(\text{AllStag}) = \frac{\lambda}{\lambda + \lambda'} \to 0. \tag{96} $$

We can see that in this example as $\varepsilon \to 0$ the probability weight clusters on the state AllHare, showing that
only this NE is stochastically stable. Of course, this qualitative result would remain true if we calculated
the full limit distribution of this stochastic problem.
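For a birth-death chain the full stationary distribution is available in closed form, so the two-state estimate can be checked directly. The sketch below is one possible concrete reading of the noisy Best Response dynamics: a revising player is drawn uniformly, evaluates payoffs against the current population fractions (herself included), and with probability `eps` picks the strategy that is not the best response. These modelling details and the function name are assumptions, not taken from the original.

```python
def stag_hunt_stationary(N, eps, a=4.0, b=3.0, c=0.0, d=5.0):
    """Stationary distribution over N_S = 0..N for noisy Best Response.

    Assumptions of this sketch: a revising player is drawn uniformly, compares
    payoffs against the current population fractions (herself included), and
    with probability eps picks the strategy that is not the best response.
    """
    threshold = (a - c) / (a - c + d - b)   # Stag is the best response above this fraction

    def p_stag(n):
        return 1.0 - eps if n / N > threshold else eps

    def up(n):      # a Hare hunter is drawn and switches to Stag
        return (N - n) / N * p_stag(n)

    def down(n):    # a Stag hunter is drawn and switches to Hare
        return n / N * (1.0 - p_stag(n))

    # Birth-death chains satisfy detailed balance: pi(n+1) * down(n+1) = pi(n) * up(n).
    weights = [1.0]
    for n in range(N):
        weights.append(weights[-1] * up(n) / down(n + 1))
    total = sum(weights)
    return [w / total for w in weights]
```

For N = 12 and a small `eps` nearly all of the weight sits at or next to AllHare, and the weight of AllStag shrinks rapidly as `eps` is reduced, in line with the two-state estimate.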

Even though the system flips occasionally into AllStag by a series of appropriate mutations, it spends 
most of the time near AllHare. If we look at the system at a random time we can almost surely find it in 
the state (more precisely, in the domain of attraction) of AllHare. The deterministic approximation based 
on Eq. (80), which predicts AllStag as an equally possible solution depending on the initial condition, is 
misleading in the long run. It should be noted, however, that when the noise level is low or the population
is large, a huge time is needed to take the system out of a non-stochastically stable configuration,
which clearly limits the practical relevance of the concept of stochastic stability.

In the example above we have found that only AllHare is stochastically stable. AllStag, which is otherwise
Pareto optimal, is not stochastically stable. Of course, this finding depends on the actual parameters in the
payoff matrix. In general (still keeping the game a Coordination game), we find that the stochastic
stability of AllHare holds iff

$$ a - c > d - b . \qquad (97) $$

This is exactly the condition that AllHare is risk-dominant, exemplifying the general connection between
risk dominance and stochastic stability in this class of games (Kandori et al., 1993; Young, 1993). When
Eq. (97) loses validity, the other NE, AllStag, becomes risk-dominant and, at the same time, stochastically
stable.

Stochastic stability probes the limit distribution of the stochastic process in the zero noise limit for a
given finite population. Another interesting question arises if the noise level is low but fixed, and the size
of the population is taken to infinity. In this limit any given configuration (strategy profile) will have zero
probability, but the distribution may concentrate on one Nash configuration and its small perturbations
where most players play the same strategy. Such a Nash configuration is called ensemble stable
(Miekisz, 2004a,b,c). The concepts of stochastic stability and ensemble stability may not necessarily coincide
(Miekisz, 2004a,b,c).



5. The structure of social graphs 

In realistic multi-player systems players do not interact with all other players. In these situations the
connectivity between players is given by two graphs, as suggested recently by Ohtsuki et al. (2007a,b), where
the nodes represent players, and the edges connecting the nodes refer to the connection between the
corresponding players. The first graph defines how the connected players play a game to gain some income.
The second graph describes the learning (strategy adoption) mechanism. In a more general formalism the
edges of the graphs may have two weight factors characterizing the strength of influence along both directions.
These extensions allow us to study the effects of preferential learning (Lieberman et al., 2005; Wu et al., 2006;
Guan et al., 2006) that can involve an asymmetry in the teaching-learning activity between the connected
players (Kim et al., 2002; Szolnoki and Szabó, 2007). The behavior of evolutionary games is strongly affected
by the underlying social structure.

Henceforth our analysis concentrates on systems where both the game theoretical interaction and the
learning mechanism are based on the same network of connectivity, and the weight of an edge is unity. The
theory of graphs [see textbooks, e.g., Bollobás (1985); Bollobás (1998)] gives a mathematical background for
the structural analysis. In the last decades extensive research has focused on the different random networks
[for detailed surveys see Amaral et al. (2000); Albert and Barabási (2002); Dorogovtsev and Mendes (2003);
Newman (2003); Boccaletti et al. (2006)] that can be considered as potential structures of connectivity. The
investigation of evolutionary games on these structures has led to the exploration of new phenomena and
raised a number of interesting questions.

The connectivity structure can be characterized by a number of topological properties. In the following we
assume that the corresponding graph is connected, i.e., there is at least one path along edges between any
two sites. Graph theory defines the degree $z_x$ of site $x$ as the number of neighbors (co-players) connected to
$x$. For random graphs the degree distribution $f(z)$ determines the probability of finding exactly $z$ neighbors
for a player. The degree is uniform, i.e., $f(z) = \delta(z - z_0)$, for regular structures$^1$ (e.g., for lattices), and
$f(z) \propto z^{-\gamma}$ (typically $2 < \gamma < 3$) for the so-called scale-free graphs. The latter exhibit sites with an extremely
large number of neighbors. Real networks can have statistical properties in between these extreme limits. It
is customary to classify structures according to their degree distributions (Amaral et al., 2000) as 1) scale-free
networks (e.g., the world-wide web) having a power-law tail for large $z$, 2) truncated scale-free networks
(e.g., the network of movie actors) with an extended power-law behavior up to some large $z$ cutoff, and 3)
single-scale networks (e.g., some friendship networks) with an exponential or Gaussian dependence in the
whole regime with a characteristic degree. For other specific real-life examples see, e.g., Amaral et al. (2000)
and Boccaletti et al. (2006).



In graph theory a clique means a complete subgraph in which all pairs are linked together. The clustering
coefficient characterizes the "cliquishness" of the closest environment of a site. More precisely, the clustering
coefficient $C_x$ of site $x$ is the proportion of actual edges among the sites within its neighborhood to the
number of potential edges that could possibly exist among them. In a different context the clustering
coefficient characterizes the fraction of possible triangles (two edges with a shared site) which are in fact
triangles (three-site cliques). The distribution of $C_x$ can also be introduced, but usually only its average value
$C$ is considered. In many cases the percolation of the overlapping triangles (or cliques with more sites) seems
to be the crucial feature responsible for interesting phenomena (Palla et al.).
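
To make the definitions of the degree and the clustering coefficient concrete, the following sketch evaluates both for a small graph stored as a dictionary of neighbor sets. The data layout, the function names and the four-site example graph are our own illustrative assumptions; sites with fewer than two neighbors are assigned $C_x = 0$ by convention.

```python
from itertools import combinations


def degrees(adj):
    """Degree z_x of every site x (number of neighbors)."""
    return {x: len(nb) for x, nb in adj.items()}


def clustering(adj, x):
    """C_x: actual edges among the neighbors of x / possible edges among them."""
    nb = adj[x]
    k = len(nb)
    if k < 2:  # convention assumed here: no pair of neighbors, so C_x = 0
        return 0.0
    links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)


def average_clustering(adj):
    """Average of C_x over all sites."""
    return sum(clustering(adj, x) for x in adj) / len(adj)


# Illustrative graph: a triangle (0, 1, 2) with a pendant site 3 attached to site 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(degrees(adj))
print({x: clustering(adj, x) for x in adj})
print(average_clustering(adj))
```

In this example sites 0 and 1 sit in a complete neighborhood ($C_x = 1$), whereas only one of the three possible pairs among the neighbors of site 2 is linked ($C_x = 1/3$).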



Now we briefly survey some prototypical connectivity structures, which have been investigated actively in 
recent years. First we have to emphasize that a mean-field-type behavior occurs for two drastically different 
graph structures as $N \to \infty$ (see Fig. 13). On one hand, the mean-field approximation is exact by definition
for those systems where all the pairs are linked, that is, the corresponding graph is complete. On the other 
hand, similar behavior arises if each player's income comes from games with only z temporary players who 
are chosen randomly in every step. In this latter setup there is no correlation between partners from one 
stage to the next. 

5.1. Lattices 

For spatial models the fixed interaction network is defined by the sites of a lattice and the edges between
those pairs of sites whose distance does not exceed a given value. The most frequently used structure is the
square lattice with von Neumann neighborhood (including connections between nearest neighbor sites, $z = 4$)
and Moore neighborhood (with connections between nearest and next-nearest neighbors, $z = 8$). Many
transitions from one behavior to a distinctly different one depend on the dimensionality of the embedding
space, therefore the considerations can be extended to $d$-dimensional hyper-cubic structures too. When
using periodic boundary conditions these systems become translation invariant, and the spatial distribution
of strategies can be investigated by mathematical tools developed in solid state theory and non-equilibrium
statistical physics (see, e.g., Appendix C).

Fig. 13. Connectivity structures for which the mean-field approach is valid in the limit $N \to \infty$. On the right hand side the
dashed lines indicate temporary connections to co-players chosen at random for a given time.

$^1$ A graph is called regular if the number of links $z_x$ emanating from node $x$ is the same, $z_x = z$, for all $x$.
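
The two standard neighborhoods on the square lattice with periodic boundary conditions can be sketched as follows. The coordinate convention (integer pairs on an $L \times L$ lattice) and the function names are assumptions made for this illustration only.

```python
def von_neumann(x, y, L):
    """Four nearest neighbors of site (x, y) on an L x L periodic square lattice."""
    return [((x + 1) % L, y), ((x - 1) % L, y),
            (x, (y + 1) % L), (x, (y - 1) % L)]


def moore(x, y, L):
    """Eight nearest and next-nearest neighbors of site (x, y), periodic boundaries."""
    return [((x + dx) % L, (y + dy) % L)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]


L = 5
print(von_neumann(0, 0, L))  # wraps around to column/row L - 1
print(moore(0, 0, L))
```

Because of the modulo operation every site, including those on the border, has the same number of neighbors (for $L \geq 3$), which is the translation invariance referred to above.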

In many cases the regular lattice only provides an initial structure for the creation of more realistic social
networks. For example, diluted lattices (see Fig. 14) can be used to study what happens if a portion $q$ of
players and/or interactions are removed at random (Nowak et al., 1994a). The resultant connectivity structure
is inhomogeneous and we cannot use analytical methods assuming translation invariance. A systematic
numerical analysis in the limit $q \to 0$, however, can give us a picture of the effect of these types of
defects.

Fig. 14. Two types of diluted lattices. Left: randomly chosen sites are removed together with the corresponding edges. Right:
randomly chosen edges are removed.

For many (partially) random connectivity structures more than one topological feature changes simultaneously,
and their individual effects are mixed up. Clarifying the role a given feature plays
requires their separation and independent tuning. Figure 15 shows some regular connectivity structures (for
$z = 4$) whose topology is strikingly different, while the clustering coefficients are all zero (or vanishing
in the limit $N \to \infty$).

5.2. Small worlds 



In Figure 15 a regular small-world network is created from a square lattice by randomly rewiring a fraction
$q$ of connections in a way that conserves the degree of each site. Random links drastically reduce the average
distance $l$ between randomly chosen pairs of sites, producing the small-world phenomenon characteristic of
many social networks (Milgram, 1967). In the limit $q \to 0$ the depicted structure is equivalent to the square
lattice. If all connections are replaced ($q = 1$) the rewiring process yields the well-investigated random regular
graph (Wormald, 1999). Consequently, the set of these structures provides a continuous structural transition
from the square lattice to random regular graphs.

In random regular graphs the concentration of short loops vanishes as $N \to \infty$ (Wormald, 1981), therefore
these structures become locally similar to a tree. In other words, the local structure is similar to the one
characterizing the fictitious Bethe lattice (existing in the limit $N \to \infty$), on which analytical calculations
can be performed thanks to translation invariance. Hence the results of simulations on large random regular
graphs can be compared with analytical predictions (e.g., the pair approximation) on the Bethe lattice, at
least for those systems where the effect of large loops is negligible.

Fig. 15. Different regular connectivity structures where each player has four neighbors: (a) square lattice, (b) regular
"small-world" network, (c) Bethe lattice (or tree-like structure) and (d) random regular graph.

Spatial lattices are full of short loops whose number decreases during the rewiring process. There exists a
wide range of $q$, however, where the two main features of small-world structures are present simultaneously
(Watts and Strogatz, 1998). These properties are (1) the small average distance $l$ between two randomly
chosen sites, and (2) the high clustering coefficient $C$, i.e., the high density of connections within the
neighborhood of a site. Evidently, the average distance increases with $N$ in a way that depends on the topology.
For the sake of comparison, $l \approx N^{1/d}$ on a $d$-dimensional lattice; $l = 1$ on a complete (fully connected) graph;
and $l \approx \ln N / \ln z$ on random graphs, where $z$ denotes the average degree. When applying the method
of random rewiring the small average distance ($l \approx \ln N / \ln z$) can be achieved for a surprisingly low portion
of random links ($q > 0.01$).

For the structures in Fig. 15 the topological character of the graph cannot be characterized by the clustering
coefficient, because $C = 0$ (in the limit $N \to \infty$) as mentioned above. In the original small-world model
suggested by Watts and Strogatz (1998) the initial structure is a one-dimensional lattice with periodic
boundary conditions, where each site is connected to its $z$ (even) nearest neighbors as shown in Fig. 16.
During a rewiring process one of the ends of $qzN/2$ bonds is shifted to another site chosen at random. The
final structure has a sufficiently high clustering coefficient and short average distances within a wide range of
$q$. Further versions of small-world networks were suggested by Newman and Watts (1999). In the simplest
structures random links are added to a lattice.
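
The rewiring construction described above can be sketched in a few lines. This is only an illustration under our own assumptions: the bonds to be rewired are drawn without repetition, the end with the smaller index is kept fixed, self-loops and double bonds are rejected, and the average distance is taken over mutually reachable pairs (the rewired graph is not guaranteed to stay connected). All function names are hypothetical.

```python
import random
from collections import deque
from itertools import combinations


def ring_lattice(n, z):
    """Ring of n sites, each linked to its z (even) nearest neighbors."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, z // 2 + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, q, rng):
    """Shift one end of a fraction q of the bonds to a randomly chosen site."""
    n = len(adj)
    bonds = [(i, j) for i in adj for j in adj[i] if i < j]
    for i, j in rng.sample(bonds, round(q * len(bonds))):
        candidates = [k for k in range(n) if k != i and k not in adj[i]]
        if not candidates:
            continue
        k = rng.choice(candidates)
        adj[i].discard(j)
        adj[j].discard(i)
        adj[i].add(k)
        adj[k].add(i)
    return adj


def mean_clustering(adj):
    """Average clustering coefficient C (sites with < 2 neighbors count as 0)."""
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k >= 2:
            links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
            total += links / (k * (k - 1) / 2)
    return total / len(adj)


def mean_distance(adj):
    """Average shortest-path length over reachable ordered pairs (BFS)."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    rng = random.Random(1)
    for q in (0.0, 0.01, 0.1, 1.0):
        g = rewire(ring_lattice(200, 4), q, rng)
        print(q, round(mean_clustering(g), 3), round(mean_distance(g), 2))
```

For $q = 0$ and $z = 4$ the sketch returns $C = 1/2$ for the ring. Scanning $q$ gives a direct way to look for the regime in which the distance has already dropped while the clustering is still appreciable.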





Fig. 16. Small-world structure (right) is created from a ring (left) with the nearest- and next-nearest neighbors. Random links
are substituted for a portion $q$ of the original links as suggested by Watts and Strogatz (1998).

In the initial structure (left graph in Fig. 16) the average clustering coefficient is $C = 1/2$ for $z = 4$, and its
value decreases linearly with $q$ (if $q \ll 1$) and tends to a value of order $1/N$ as $q \to 1$. The analysis of these
types of graphs becomes interesting for those evolutionary games where the clustering or percolation of the
overlapping cliques plays a crucial role (see Sect. 6.8).

5.3. Scale-free graphs

In the above inhomogeneous connectivity structures the degree distribution $f(z)$ has a sharp peak around
$\langle z \rangle$ and the occurrence of sites with $z \gg \langle z \rangle$ is unlikely. There exist, however, many real networks in
nature (e.g., the internet, networks of acquaintances, collaborations, metabolic reactions, and many other
biological networks) where the presence of sites with large $z$ is essential and is due to fundamental processes.

In recent years many models have been developed to reproduce the main characteristics of these networks.
Here we discuss only two procedures for growing networks exhibiting scale-free properties. Both procedures
start with $k$ connected sites, and in each step $t$ we add one site linked to $m$ (different) existing sites, as
demonstrated in Fig. 17 for $k = 3$ and $m = 2$.

Fig. 17. Creation of two scale-free networks with the same average degree ($\langle z \rangle = 4$) as suggested by Barabási and Albert (1999)
(top) and Dorogovtsev et al. (2001) (bottom).



For the growth procedure suggested by Barabási and Albert (1999) a site with $m$ links is added to the
system step by step, and the new site is preferentially linked to those sites that already have large degrees.
To realize the "rich-gets-richer" phenomenon the new site is linked to the existing site $x$ with a probability
depending on its degree,

$$ \Pi_x = \frac{z_x}{\sum_y z_y} . \qquad (98) $$

After $t$ steps this random graph has $N = 3 + t$ sites and $3 + tm$ edges. For large $t$ (or $N$) the degree
distribution exhibits a power law behavior within a wide range of $z$, i.e., $f(z) \approx 2m^2 z^{-\gamma}$ where $\gamma = 3$,
because older sites increase their degree at the expense of the younger ones. The average connectivity
approaches $\langle z \rangle = 2m$ for large $t$, since each edge contributes to the degree of two sites.
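
A minimal sketch of this growth rule is given below. It assumes that the process starts from $k = 3$ fully connected sites (as in Fig. 17) and that the $m$ distinct targets of a new site are obtained by repeated degree-proportional draws with rejection of duplicates; both choices, like the function names, are our illustrative assumptions rather than a prescription of the original algorithm.

```python
import random
from collections import Counter


def barabasi_albert(t_steps, m, rng, k=3):
    """Grow a graph from k fully connected sites by linear preferential attachment."""
    assert m <= k, "each new site needs m distinct existing targets"
    edges = [(i, j) for i in range(k) for j in range(i + 1, k)]
    # Every site appears in `ends` once per link, so a uniform draw from
    # `ends` selects site x with probability z_x / sum_y z_y, cf. Eq. (98).
    ends = [site for edge in edges for site in edge]
    for step in range(t_steps):
        new = k + step
        targets = set()
        while len(targets) < m:  # reject repeated targets
            targets.add(rng.choice(ends))
        for x in targets:
            edges.append((new, x))
            ends.extend((new, x))
    return edges


def degree_counts(edges):
    """Degree of every site, computed from the edge list."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


edges = barabasi_albert(2000, 2, random.Random(0))
deg = degree_counts(edges)
print(len(deg), len(edges), 2 * len(edges) / len(deg), max(deg.values()))
```

With $k = 3$ the edge count is $3 + tm$, so the mean degree $2(3 + tm)/(3 + t)$ tends to $2m$, in line with $\langle z \rangle = 4$ quoted for $m = 2$ in Fig. 17. The largest degree is far above the mean, reflecting the advantage of the oldest sites.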

In fact, network growth with linear preferential attachment naturally leads to power-law degree distributions.
A more general family of growing networks can be introduced if the attachment probability $\Pi_x$ is
modified as

$$ \Pi_x = \frac{g(z_x)}{\sum_y g(z_y)} , \qquad (99) $$

where $g(z) > 0$ is an arbitrary function. The analytical calculations of Krapivsky et al. (2000) demonstrated
that nonlinear preferential attachment destroys the power-law behavior. The scale-free nature of the growing
network can only be achieved if the attachment probability is asymptotically linear, i.e., $\Pi_x \sim a z_x$ as $z \to \infty$.
In this case the exponent $\gamma$ can be tuned to any value between 2 and $\infty$.

In the Barabási-Albert model (as well as in its variants) the clustering coefficient vanishes as $N \to \infty$.
On the contrary, $C$ remains finite in many real networks. Dorogovtsev et al. (2001) have suggested another
procedure to create growing graphs providing more realistic clustering coefficients. In this procedure one
site is added to the system in each step in such a way that the new site is linked to both ends of a
randomly chosen edge. The degree distribution of this structure is similar to that found in the
Barabási-Albert model. Apparently these two growth procedures yield very similar graphs for small $t$, as illustrated
in Fig. 17. The algorithm suggested by Dorogovtsev et al. (2001), however, creates at least one triangle in
each step, therefore it results in a finite clustering coefficient ($C \approx 0.5$) for large $N$.
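
The edge-based growth rule is even simpler to sketch than degree-based attachment. The snippet below assumes a triangle of three sites as the initial graph (the case $k = 3$ of Fig. 17); the function name is hypothetical.

```python
import random


def edge_attachment_growth(t_steps, rng):
    """Each new site is linked to both ends of a randomly chosen existing edge."""
    edges = [(0, 1), (0, 2), (1, 2)]  # assumed initial triangle
    for step in range(t_steps):
        new = 3 + step
        a, b = rng.choice(edges)  # uniform over edges favors high-degree sites
        edges.append((new, a))
        edges.append((new, b))
    return edges


edges = edge_attachment_growth(1000, random.Random(0))
n_sites = 3 + 1000
print(n_sites, len(edges), 2 * len(edges) / n_sites)
```

Choosing an edge uniformly at random reaches a given site with a probability proportional to its degree, so preferential attachment is built in implicitly, while the new site together with the two ends of the chosen edge always forms a triangle.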

For realistic networks the growing process is affected by the aging of vertices and also by the cost of linking
or the limited capacity of a vertex (Amaral et al., 2000). These phenomena prevent the formation
of a scale-free degree distribution by causing a fast decrease in the probability for sufficiently large degrees.
Many models of network growth have been developed in recent years, and their investigation has
already become an extensive area within statistical physics. Nowadays this research covers the investigation
of time-dependent networks too.



5.4. Evolving networks 



The simplest possible time-dependent connectivity structure is illustrated by the right hand plot in Fig. 13.
Here co-players are chosen randomly with equal probability in each round, making the mean-field
approximation valid in the limit $N \to \infty$. Partially random connectivity structures with some temporary co-players
are convenient from a technical point of view. For example, if randomly chosen co-players are substituted in
each round for a portion $q$ of the standard neighbors on the square lattice, then spatial symmetries are
conserved on average, and the generalized mean-field technique discussed in Appendix C can be safely applied.
The increase of $q$ can induce a transition from spatial to mean-field-type behavior, as will be discussed
later on.



Dynamically evolving networks in general are the subject of intensive research in many fields (Boccaletti et al.,
2006; Jackson, 2005; Dutta and Jackson, 2003). Here we only focus on models where the growth and/or
rewiring of the network is dictated by an evolutionary game played over the evolving social structure. Some
early works on this subject include Skyrms and Pemantle (2000), who investigated various dynamic models
of network formation assuming reinforcement learning. Choosing partners to play with in the next round
was assumed to depend on a probability distribution over possible co-players, which was updated according
to the payoffs achieved in the former round. Even models with simple constant reinforcement rules were able
to demonstrate how structure can emerge from uniformity or, under different model parameters, how
uniformity may develop from an initial structure. In more complicated games like the Stag Hunt game with
possible discounting and/or noise the co-evolution of network and strategies can strongly depend on the
relative speed of the two fundamental updating processes. It was found that when strategy evolution is faster
than structural evolution there is a tendency to coordinate on the risk dominant equilibrium (hunting hare)
as in the mean-field case. In the opposite limit of fast network evolution the payoff-dominant equilibrium
outcome (hunting stag) becomes more likely. In between the two limits the population is typically split into
disconnected clusters of stag or hare hunters which coexist in the society.

In a conceptually similar model Zimmermann et al. (2000, 2001); Santos et al. (2006a) considered what
happens if the links of a social graph can be removed and/or replaced by other links. They also assumed
that strategies and structure evolve simultaneously with different rates. The (rare) cancellation of a link
depended on the total payoff received by the given pair of players. Their model becomes interesting for
the Prisoner's Dilemma when an unsatisfied defector breaks the link to neighboring defectors with some
probability and looks for other randomly chosen partners. This mechanism conserves the number of links. In
models suggested by Biely et al. (2007); Pacheco et al. (2006a) the unsatisfied players are allowed to cancel
links (to defectors), and the cooperators can create new links to one of the second neighbors suggested by
the corresponding first neighbor. Pacheco et al. (2006b) have developed an elegant mean-field type approach
assuming finite life times for the links and allowing preferential search for new links. The main predictions
of these models will be discussed in Sec. 6.9.

For many real systems the migration of players plays a crucial role. It is assumed that during a move in the
graph the player conserves her original strategy but faces a new neighborhood. There are several ways in which
this feature can be built into a model. If the connectivity structure is defined by a lattice (or graph) then
we can introduce empty sites where neighboring players can move to. Depending on the microscopic details
this additional feature can either decrease or increase the frequency of cooperators in the spatial Prisoner's
Dilemma (Vainstein et al., 2007). The effects of local migration can be studied in a more convenient
way: two neighboring players (chosen randomly) are allowed to exchange their positions. In Section 8 we will
consider the consequences of tuning the relative strength between imitation and site exchange (migration).
In continuous space the random migration of players can be studied using reaction diffusion equations
(Hofbauer et al., 1997; Wakano, 2006) which exhibit either traveling waves or self-organizing patterns.



6. Prisoner's Dilemma 



By now the Prisoner's Dilemma has become a paradigm known world-wide for studying the emergence of
cooperative (altruistic) behavior in communities consisting of selfish individuals. The name of this game as
well as the traditional notation are explained in Appendix A.7.

In this two-player game the players have two options to choose from, which are called defection ($D$) and
cooperation ($C$). For mutual cooperation the players receive the highest total payoff shared equally. The
highest individual payoff is reached by a defector against a cooperator. In the most interesting case the
defector's extra income (relative to the income for mutual cooperation), $T - R$, is less than the relative loss
of the cooperator, $R - S$. According to classical (rational) game theory both players should play defection,
since this is the Nash equilibrium. However, this outcome would provide them with the second worst income,
thus creating the dilemma.

There are other Social Dilemmas (Macy and Flache, 2002; Skyrms, 2003) where mutual cooperation
could provide the highest total income, although selfish individual reasoning often leads to other choices.
For example, in the Snowdrift game (see Appendix A.11) it is better to choose the opposite of what the
other player does; in the Stag Hunt game (see Appendix A.12) the player is better off doing whatever the
co-player does. In recent years several authors (e.g., Santos et al., 2006b; Hauert et al., 2006; Hauert,
2006) have studied all these dilemmas in a unified framework by expanding the range of payoff parameters.
In the following our discussion will focus on the Prisoner's Dilemma, which is the most prominent example;
other social dilemmas are less favorable to defection.

People frequently face Prisoner's Dilemma situations in real life when they have to choose between
being selfish or altruistic, keeping the ethical norms or not, working (studying) hard or being lazy, etc. (Alexander,
1987; Axelrod, 1986; Binmore, 1994; Trivers, 1985). Disarmament and some business negotiations are also
burdened with this type of dilemma. Traffic flow models and game theoretic concepts were combined by
Helbing et al. (2005) and Perc (2007). Dugatkin and Mesterton-Gibbons (1996) have shown that the Prisoner's
Dilemma can be recognized in the behavior of hermaphroditic fish alternately releasing eggs and sperm,
because the production of eggs implies a higher metabolic investment. The application of this game, however,
is not restricted to human or animal societies. When considering the interactions between two bacteriophages,
living and reproducing within infected cells, Turner and Chao (1999) have identified the defective
and cooperative versions (Nowak and Sigmund, 1999). The investigation of two-dimensional models helps our
understanding of how altruistic behavior occurs in biofilms (Kreft, 2004; MacLean and Gudelj, 2001). In a
biochemical example (Pfeiffer et al., 2001) the two different pathways of adenosine triphosphate (ATP)
production represent cooperators (low rate but high yield of ATP production) and defectors (high rate but low
yield). Some authors speculate that in multicellular organisms cells can be considered as behaving cooperatively,
except for tumor cells that have shifted towards a more selfish behavior (Pfeiffer and Schuster, 2005). Game
theoretical methods may provide a promising new approach for understanding cancer (Gatenby and Maini,
2003; Frick and Schuster, 2003; Axelrod et al., 2006).

Despite the prediction of classical game theory the emergence of cooperation can be observed in many
naturally occurring Prisoner's Dilemma situations. In the subsequent sections we will discuss the
conditions under which cooperative behavior can subsist in multi-agent models. During the last decades it turned
out that the rate of cooperation is affected by the strategy set, the evolutionary rule, the payoffs, and the
structure of connectivity. With so many degrees of freedom the analysis cannot be complete. In many cases
investigations are carried out for a payoff matrix with a limited number of variable parameters.



6.1. Axelrod's Tournaments 



In the late 1970s a computer tournament was conducted by Robert Axelrod (1980a,b, 1984), whose
interest in game theory arose essentially from a deep concern about international politics and especially
the risk of nuclear war. Axelrod wanted to identify the conditions under which cooperation could emerge
in a Prisoner's Dilemma, therefore he invited game theorists to submit strategies (in the form of computer
programs) for playing an Iterated Prisoner's Dilemma game with the payoff matrix

(100) 

The computer tournament was structured as a round robin game. Each player was paired with all other 
ones, then with its own twin (the same strategy) and with a random strategy choosing cooperation and 
defection with equal probability. Each iterated game between fixed opponents consisted of two hundred 
rounds, and the entire round robin tournament was repeated five times to improve the reliability of the 
scores. Within a tournament the players were allowed to take into account the history of the interactions as
it had developed thus far. All these rules had been announced well before the tournament.

The highest average score was achieved by the so-called "Tit-for-Tat" strategy developed by Anatol
Rapoport. This was the simplest of all the fourteen strategies submitted. Tit-for-Tat is a strategy which 
cooperates on the first round, and thereafter repeats what the opponent has done on the previous move. 

The main results of this computer tournament were published and people were soon invited to submit 
other strategies for a second tournament. The new strategies were developed in the knowledge of the fourteen 
old strategies, therefore a large portion of them would have performed substantially better than Tit-for-Tat 
in the environment of the first tournament. Surprisingly, Tit-for-Tat won the second tournament too. 

Using this set of strategies some further systematic investigations were performed to examine the
robustness of the results. For example, the tournaments were repeated with some modifications in the population
of strategies. The most remarkable investigations are related to the adoption of a dynamical rule (Trivers,
1971; Dawkins, 1976; Maynard Smith, 1978) that mimics Darwinian selection. This evolutionary computer
tournament was started with $Q$ strategies, each of which was represented by a portion of players in the limit
$N \to \infty$. After a round robin game the score for each player was evaluated, and this quantity served as
a fitness to determine the abundance of strategies in the subsequent generation. By repeating these steps
one can determine the variations in the population of strategies step by step. In most of these simulations
the success of Tit-for-Tat was confirmed, because the evolutionary process almost always ended up with a
population of some mutually cooperating strategies dominated by Tit-for-Tat.

Nevertheless, these investigations gave numerical evidence that there is no absolutely best strategy
independent of the environment. The empirical success of Tit-for-Tat is related to some of its fundamental
properties. Tit-for-Tat does not wish to exploit others; it tries to maximize the total payoff. On the other
hand, Tit-for-Tat cannot be exploited (except for the first step), because it reciprocates defection (as a
punishment) until the opponent unilaterally cooperates and thus compensates her by the largest income. One
can think that Tit-for-Tat is a forgiving strategy because its choice of defection is determined by the last
choice of its co-player only. A Tit-for-Tat strategy is easily identifiable, and its future behavior is predictable,
therefore for Iterated Prisoner's Dilemmas the best strategy against Tit-for-Tat is to cooperate always. On
the other hand, exploiting strategies are suppressed by Tit-for-Tat because of their mutual defection
sequences. Consequently, in evolutionary games the presence of Tit-for-Tat promotes mutual cooperation in
the whole community.
The main conclusions of these pioneering works were collected in a paper by Axelrod and Hamilton (1981),
and many technical details were presented in Axelrod's book (Axelrod, 1984). The relevant message for
people facing a Prisoner's Dilemma can be summarized as follows:

(1) Don't be envious. 

(2) Don't be the first to defect. 

(3) Reciprocate both cooperation and defection. 

(4) Don't be too clever. 

In short, the above game theoretical investigations suggest that people wishing to benefit from a Prisoner's
Dilemma situation should follow a strategy similar to Tit-for-Tat. The application of this strategy (in
repeated games) provides a way to avoid the "Tragedy of the Community".

In any case, if the payoff of a one-shot Prisoner's Dilemma game can be divided into small portions then it is
usually very useful to transform the game into a Repeated Prisoner's Dilemma game with an uncertainty in
the ending (because of the applicability of Tit-for-Tat). This approach has been used successfully in various
political negotiations.

Besides the deduction of the above very important results, the evolutionary approach also gave insight
into the mechanisms and processes resulting in the spontaneous emergence of cooperation. Here it is worth
recalling that the results were concluded from a set of numerical investigations with a rather limited number
of strategies (14 and 64 in the first and second tournaments), whose construction includes eventualities. For
the purpose of a more quantitative and reliable analysis, we will briefly describe another approach, which is
based on a well-defined, continuous set of strategies capable of representing a remarkably rich variety of
interactions.



6.2. Emergence of cooperation for stochastic reactive strategies 



The success of the Tit-for-Tat strategy in Axelrod's computer tournament inspired Nowak and Sigmund
(1989a,b, 1990) to study a set of stochastic reactive strategies, where the choice of action in a given round
is only affected by the opponent's behavior in the previous round. These mixed strategies are described by
three parameters, s = (u, p, q), where u is the probability to choose cooperation in the first round, while p
and q are the conditional probabilities to cooperate in later rounds, given that the opponent's previous choice
was cooperation or defection, respectively. So, if a player using strategy s = (u, p, q) is matched against an
opponent with a strategy s' = (u', p', q'), then they will cooperate with probabilities

$$ u_1 = u \quad \text{and} \quad u'_1 = u' \tag{101} $$

in the first step, respectively. In the second step the probability of cooperation becomes

$$ u_2 = p u' + q (1 - u') , \qquad u'_2 = p' u + q' (1 - u) . \tag{102} $$

In subsequent steps the variation of these quantities is given by the recursive relation

$$ u_{n+2} = v u_n + w , \qquad u'_{n+2} = v u'_n + w' , \tag{103} $$

where

$$ v = (p - q)(p' - q') , \qquad w = p q' + q (1 - q') , \qquad w' = p' q + q' (1 - q) . $$

According to this recursive process, if $|p - q|, |p' - q'| < 1$, then the probabilities of cooperation tend
towards the stationary values,

$$ u = \frac{q + (p - q) q'}{1 - (p - q)(p' - q')} , \qquad u' = \frac{q' + (p' - q') q}{1 - (p - q)(p' - q')} , \tag{104} $$

independently of the initial probabilities of cooperation. In this case we can drop the first-round parameter
from the notation and label strategies with p and q only, s = s(p, q).
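
The convergence claimed above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the sample parameter values are illustrative assumptions. It iterates the round-by-round update behind Eqs. (102)-(103) and compares the result with the closed form of Eq. (104).

```python
def stationary_cooperation(p, q, p2, q2):
    """Closed-form limits of Eq. (104) for s = (p, q) against s' = (p2, q2)."""
    v = (p - q) * (p2 - q2)
    u = (q + (p - q) * q2) / (1.0 - v)
    u2 = (q2 + (p2 - q2) * q) / (1.0 - v)
    return u, u2


def iterate_cooperation(u, p, q, u2, p2, q2, rounds=200):
    """Round-by-round update: each player reacts to the opponent's previous move."""
    for _ in range(rounds):
        u, u2 = p * u2 + q * (1.0 - u2), p2 * u + q2 * (1.0 - u)
    return u, u2


# Hypothetical example strategies; any values with |p - q| < 1 behave the same way.
limit = stationary_cooperation(0.9, 0.2, 0.7, 0.4)
for first_round in (0.0, 0.5, 1.0):
    reached = iterate_cooperation(first_round, 0.9, 0.2, first_round, 0.7, 0.4)
    print(first_round, reached, limit)
```

Whatever first-round probabilities are chosen, the iterated values approach the same pair of limits, which is why the first-round parameter can be dropped.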

Using the expressions for u and u' the expected income of strategy s against s' reads

$$ A(s, s') = R u u' + S u (1 - u') + T (1 - u) u' + P (1 - u)(1 - u') , \tag{105} $$

at least when the stationary state is reached. For a homogeneous system consisting of strategy s only, the
average payoff obeys a simple form,

$$ A(s, s) = P + (T + S - 2P) u + (R - S - T + P) u^2 , \tag{106} $$

which has a maximum at p = 1 (nice strategies cooperating mutually forever) as illustrated in Fig. 18.
Notice that A(s, s) becomes a non-analytical function at s = (1, 0) (Tit-for-Tat strategy), where the payoff
depends on the initial probabilities of cooperation, u and u'.
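
As a small illustration of Eqs. (105) and (106), the sketch below evaluates the stationary payoff for Axelrod's payoff values (T = 5, R = 3, P = 1, S = 0). The helper names are assumptions made for this example only.

```python
T, R, P, S = 5.0, 3.0, 1.0, 0.0  # Axelrod's payoffs, as used for Fig. 18


def stationary_u(p, q, p2, q2):
    """Stationary cooperation probabilities of Eq. (104)."""
    v = (p - q) * (p2 - q2)
    return (q + (p - q) * q2) / (1.0 - v), (q2 + (p2 - q2) * q) / (1.0 - v)


def payoff(s, s2):
    """Expected stationary income of s = (p, q) against s2 = (p', q'), Eq. (105)."""
    u, u2 = stationary_u(*s, *s2)
    return R * u * u2 + S * u * (1 - u2) + T * (1 - u) * u2 + P * (1 - u) * (1 - u2)


def homogeneous_payoff(s):
    """Average payoff in a population using s only, Eq. (106)."""
    u, _ = stationary_u(*s, *s)
    return P + (T + S - 2 * P) * u + (R - S - T + P) * u * u


for s in [(0.01, 0.01), (0.99, 0.01), (0.99, 0.36)]:
    print(s, payoff(s, s), homogeneous_payoff(s))
```

For the noisy Tit-for-Tat strategy (0.99, 0.01) the stationary cooperation level is 1/2, well below that of the more generous (0.99, 0.36), which illustrates the cost of retaliation cycles in a noisy setting.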







Fig. 18. Average payoff if all players follow an s = (p, q) strategy for T = 5, R = 3, P = 1, and S = 0. The function has a 
singularity at s = (1,0) (TFT). 



In their simulation Nowak and Sigmund (1992) used Axelrod's payoff matrix in Eq. (100) and the above
stationary payoff functions. Their numerical investigation aimed to clarify the role of Tit-for-Tat (TFT) and
Generous (Forgiving) Tit-for-Tat (GTFT) strategies (see Appendix B for a description). Deterministic
Tit-for-Tat has a weakness when playing against itself in a noisy environment (Axelrod, 1984): after any
mistake two (deterministic) Tit-for-Tat strategies would choose alternately to defect and to cooperate in
opposite phases. This shortcoming is suppressed by Generous Tit-for-Tat, which, instead of retaliating
defection, chooses cooperation against a previously defecting opponent with a nonzero probability q.

The evolutionary game suggested by Nowak and Sigmund (1992) started with Q = 100 strategies, with p
and q parameters chosen at random and with the same initial strategy concentrations. Now we will discuss
this approach using strategies distributed equidistantly on the p-q plane.

Consider an evolutionary Prisoner's Dilemma game with $Q = 15^2 = 225$ stochastic reactive strategies
$s_i = (p_i, q_i)$ with parameters $p_i = 0.01 + 0.07 k_1$ and $q_i = 0.01 + 0.07 k_2$, where $k_1 = i \pmod{15}$ and
$k_2 = (i - k_1)/15$ for $i = 0, 1, \ldots, Q - 1$. In this discretized strategy space, (0.01, 0.01) is the closest
approximation of the strategy AllD (unconditional defection), and (0.99, 0.01) is that of Tit-for-Tat. Besides
these, this space also includes several Generous Tit-for-Tat strategies, e.g., (0.99, 0.08), (0.99, 0.15), etc.
Initially all strategies have the same concentration, $\rho_{s_i}(0) = 1/Q$. In subsequent steps (t = 1, 2, ...) the
concentration for each strategy is determined by its previous payoff,

$$ \rho_{s_i}(t+1) = \frac{\rho_{s_i}(t) E(s_i, t)}{\sum_{s_j} \rho_{s_j}(t) E(s_j, t)} , \tag{107} $$

where the total payoff for strategy $s_i$ is given as

$$ E(s_i, t) = \sum_{s_j} \rho_{s_j}(t) A[s_i(t), s_j(t)] , \tag{108} $$

and $A[s_i(t), s_j(t)]$ is defined by Eq. (105) for T = 5, R = 3, P = 1, and S = 0. Equation (107) is the
discretized form of the replicator equation in the Maynard Smith form in Eq. (42). In the present system
the payoffs are positive and the evolutionary rule yields exponential decrease in the concentration for the
worst strategies. The evolutionary process is illustrated in Fig. 19.
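
The evolutionary process just described can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names are invented here, the stationary payoffs of Eq. (105) are used from the first step, and each update is normalized by the population-average payoff so that the concentrations keep summing to one.

```python
import numpy as np

T, R, P, S = 5.0, 3.0, 1.0, 0.0
SIDE = 15
Q = SIDE * SIDE


def strategy_grid():
    """p_i and q_i on the 15 x 15 grid: 0.01, 0.08, ..., 0.99."""
    idx = np.arange(Q)
    k1, k2 = idx % SIDE, idx // SIDE
    return 0.01 + 0.07 * k1, 0.01 + 0.07 * k2


def payoff_matrix(p, q):
    """A[i, j]: stationary payoff of strategy i against strategy j, Eq. (105)."""
    d = p - q
    v = np.outer(d, d)
    u = (q[:, None] + d[:, None] * q[None, :]) / (1.0 - v)  # i's cooperation level vs j
    u_opp = u.T                                              # j's cooperation level vs i
    return (R * u * u_opp + S * u * (1 - u_opp)
            + T * (1 - u) * u_opp + P * (1 - u) * (1 - u_opp))


def evolve(rho, A, steps):
    """Discrete replicator update: concentration times payoff, then renormalized."""
    for _ in range(steps):
        E = A @ rho                 # total payoff of each strategy, Eq. (108)
        rho = rho * E / (rho @ E)   # normalization by the average payoff
    return rho


p, q = strategy_grid()
A = payoff_matrix(p, q)
rho = evolve(np.full(Q, 1.0 / Q), A, steps=50)
best = int(np.argmax(rho))
print(f"most common strategy after 50 steps: ({p[best]:.2f}, {q[best]:.2f})")
```

Running `evolve` for increasing numbers of steps and recording the concentrations of selected strategies gives time series of the kind discussed in the following paragraphs.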




Fig. 19. Distribution of the relative strategy concentrations for different t values indicated in the plots. The height of the
columns indicates the relative concentrations of the corresponding strategies, except for those where $\rho_s(t)/\rho_s(0) < 1$. In the
upper-left plot the crosses show the distribution of the (Q = 225) strategies on the p-q plane.

Figure 19 shows that the strategy concentrations near (0, 0) grow rapidly due to the large income received
from the exploited, and thus diminishing, strategies near (1, 1). After a short transient process the system
is dominated by defective strategies whose income tends towards the minimum, P = 1. This phenomenon
reminds us of the following ancient Chinese script (circa 1000 B.C.) as interpreted by Wilhelm and Baynes
(1977):

"Here the climax of the darkening is reached. The dark power at first held so high a place that it could 
wound all who were on the side of good and of the light. But in the end it perishes of its own darkness, for 
evil must itself fall at the very moment when it has wholly overcome the good, and thus consumed the energy 
to which it owed its duration. " 

During the extinction process the few exceptions are the remnants of the Tit-for-Tat strategies near (1, 0)
cooperating with each other. Consequently, after a suitable time the defectors' income becomes smaller than
the payoff of the Tit-for-Tat strategies, which then grow at the expense of defectors as shown in Fig. 20.

The increase of the population of Tit-for-Tat is accompanied by a striking increase in the average payoff as
demonstrated in Fig. 20. However, the present stochastic version of Tit-for-Tat cannot provide the maximum
possible total payoff forever. The evolutionary process does not terminate with the prevalence of Tit-for-Tat,
because this state can be invaded by Generous Tit-for-Tat (Axelrod and Dion, 1988; Nowak, 1990b). The left
plot of Fig. 20 shows the nearly exponential growth of (0.99, 0.08) and the exponential decay of (0.99, 0.01)
around $t \approx 750$.

A stability analysis helps to understand the relevant features of this behavior. In Fig. 21 the gray territory
indicates those strategies whose homogeneous population can be invaded by AllD. Evidently, one can also
determine the set of strategies s' which are able to invade a homogeneous system with a given strategy s.
The complete stability analysis of this system is given by Nowak and Sigmund (1990).

Fig. 20. Time dependence of the strategy concentrations (left) for three relevant strategies [solid line: (0.01, 0.01); dotted line:
(0.99, 0.01); dashed line: (0.99, 0.08)] playing a relevant role until t = 1000, and the variation of the average payoff (right).







Fig. 21. Defectors (0, 0) can invade a homogeneous population of stochastic reactive strategies belonging to the gray territory.
Within the hatched region $\dot p, \dot q > 0$, while along the inside boundary $\dot p = \dot q = 0$. The boundaries are evaluated for T = 5,
R = 3, P = 1, and S = 0.

In biological applications it is natural to assume that mutant strategies only differ slightly from their
parents. Thus Nowak and Sigmund (1990) assumed an Adaptive Dynamics, and in the spirit of Eq. (54)
they introduced a vector field in the p-q plane pointing towards the preferred direction of infinitesimal
evolution,

$$ \dot p = \left. \frac{\partial A(s, s')}{\partial p} \right|_{s'=s} , \qquad \dot q = \left. \frac{\partial A(s, s')}{\partial q} \right|_{s'=s} . \tag{109} $$



It is found that both quantities become zero along the same boundary separating positive and negative
values. In Fig. 21 the hatched region shows those strategies where $\dot p, \dot q > 0$. In the above simulation (see
Fig. 19) the noisy Tit-for-Tat strategy (0.99, 0.01) is within this region, therefore a homogeneous system
develops into a state dominated by a more generous strategy with a higher q value. Thus, for the present set of
strategies, the strategy (0.99, 0.01) will be dominated by (0.99, 0.08), which will be overcome by (0.99, 0.15),
and so on, as illustrated by the variation of strategy frequencies in Fig. 22.

The evolutionary process stops when the average value of q reaches the boundary defined by $\dot p = \dot q = 0$.
In the low noise limit, $p \to 1$, $\dot p$ and $\dot q$ vanish at $q \approx 1 - (T - R)/(R - S)$ (= 1/3 for the given payoffs).

Fig. 22. Time-dependence of the concentration of the strategies dominating in some time interval. From left to right the peaks
are reached by the following p and q values: (0.01, 0.01), (0.99, 0.01), (0.99, 0.08), (0.99, 0.15), (0.99, 0.22), (0.99, 0.29), and
(0.99, 0.36). The horizontal dotted line shows the maximum when only one strategy exists.

On the other hand, this state is invasion-proof against AllD as long as $q < (R - P)/(T - P)$ (= 1/2 here). Thus
the quantity

$$ \min \left\{ 1 - \frac{T - R}{R - S} , \; \frac{R - P}{T - P} \right\} \tag{110} $$

(for $p \approx 1$) can be considered as the optimum of generosity (forgiveness) in the weak noise limit (Molander,
1985). A rigorous discussion, including the detailed stability analysis of stochastic reactive strategies in a
more general context, is given in Nowak and Sigmund (1989a); Nowak (1990a,b).
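
As a quick arithmetic check of Eq. (110), the following sketch evaluates both bounds with exact fractions for the payoffs used in this section. The function name is an assumption made for illustration.

```python
from fractions import Fraction


def optimal_generosity(T, R, P, S):
    """Smaller of the two bounds in Eq. (110), valid in the limit p -> 1."""
    drift_bound = 1 - Fraction(T - R, R - S)   # where the adaptive drift in q stops
    alld_bound = Fraction(R - P, T - P)        # invasion-proof against AllD below this
    return min(drift_bound, alld_bound)


print(optimal_generosity(5, 3, 1, 0))  # 1/3
```

For T = 5, R = 3, P = 1, S = 0 the two bounds are 1/3 and 1/2, so the drift condition is the binding one.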



6.3. Mean-field solutions 

Consider a situation where the players interact with a limited number of randomly chosen co-players
within a large population. For the sake of simplicity, we assume that a portion $\rho$ of the population follows
unconditional cooperation (AllC, to be denoted simply as C), and a portion $1 - \rho$ plays unconditional
defection (AllD, or simply D). No other strategies are allowed. This is the extreme limit where players have
no memory of the game.

Each player's income comes from z games with randomly chosen opponents. The average payoffs for
cooperators and defectors are given as

$$ U_C = R z \rho + S z (1 - \rho) , \qquad U_D = T z \rho + P z (1 - \rho) , \tag{111} $$

and the average payoff of the population is

$$ \bar{U} = \rho U_C + (1 - \rho) U_D . \tag{112} $$

With the replicator dynamics Eq. (41), the variation of $\rho$ satisfies the differential equation

$$ \dot\rho = \rho (U_C - \bar{U}) = \rho (1 - \rho)(U_C - U_D) . \tag{113} $$

In general, the macro-level dynamics satisfies the approximate mean value equation Eq. (80), which in
the present case reads

$$ \dot\rho = (1 - \rho) \, w(D \to C) - \rho \, w(C \to D) . \tag{114} $$

In the case of the Smoothed Imitation rule Eq. (64) with (65), the transition rates are

$$ w(C \to D) = (1 - \rho) \frac{1}{1 + \exp[(U_C - U_D)/K]} , \qquad w(D \to C) = \rho \frac{1}{1 + \exp[(U_D - U_C)/K]} , \tag{115} $$

and the dynamical equation becomes

$$ \dot\rho = \rho (1 - \rho) \tanh\left( \frac{U_C - U_D}{2K} \right) . \tag{116} $$

In both cases above, $\rho$ tends to 0 as $t \to \infty$ since $U_D > U_C$ for the Prisoner's Dilemma independently
of the value of z and K. This means that in the mean-field case cooperation cannot be sustained against
defectors with imitative update rules. Notice furthermore that both systems have two absorbing states,
$\rho = 0$ and $\rho = 1$, where $\dot\rho$ vanishes.
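
Both imitative mean-field equations can be integrated with a simple Euler scheme to see the decay of cooperation. The sketch below is illustrative only: the payoff values are Axelrod's, while z = 4, K = 0.5, the step size, and the function names are assumptions chosen for this example.

```python
import math

T, R, P, S = 5.0, 3.0, 1.0, 0.0


def payoff_gap(rho, z):
    """U_C - U_D from Eq. (111); negative for any Prisoner's Dilemma."""
    return z * ((R - T) * rho + (S - P) * (1.0 - rho))


def replicator_rate(rho, z=4):
    """Right-hand side of Eq. (113)."""
    return rho * (1.0 - rho) * payoff_gap(rho, z)


def imitation_rate(rho, z=4, K=0.5):
    """Right-hand side of Eq. (116)."""
    return rho * (1.0 - rho) * math.tanh(payoff_gap(rho, z) / (2.0 * K))


def integrate(rate, rho0=0.9, dt=0.01, steps=20000):
    """Explicit Euler integration of d(rho)/dt = rate(rho)."""
    rho = rho0
    for _ in range(steps):
        rho += dt * rate(rho)
    return rho


print(integrate(replicator_rate), integrate(imitation_rate))
```

Even when starting from 90% cooperators, both rules drive the cooperator density to (numerically) zero, while a population started exactly at 0 or 1 never moves.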

For the above dynamical rules we assumed that players adopt a co-player's strategy with a probability
depending on the payoff difference. The behavior is qualitatively different for innovative strategies such
as Smoothed Best Response (Logit or Glauber dynamics), where players update strategies independently of
the actual density of the target strategy. For two strategies Eq. (71) gives

$$ w(C \to D) = \frac{1}{1 + \exp[(U_C - U_D)/K]} , \qquad w(D \to C) = \frac{1}{1 + \exp[(U_D - U_C)/K]} , \tag{117} $$

where K characterizes the noise as before. In this case the dynamical equation becomes

$$ \dot\rho = (1 - \rho) \, w(D \to C) - \rho \, w(C \to D) = w(D \to C) - \rho , \tag{118} $$

and the corresponding stationary solution satisfies the implicit equation,

$$ \rho = w(D \to C) = \frac{1}{1 + \exp\left( z [(T - R)\rho + (P - S)(1 - \rho)]/K \right)} . \tag{119} $$

In the limit $K \to 0$ this reproduces the Nash equilibrium $\rho = 0$, whereas the frequency of cooperators
increases monotonically from 0 to 1/2 as K increases from 0 to $\infty$ (noise-sustained cooperation). In comparison
with the algebraic decay of cooperation predicted by Eqs. (113) and (116), Smoothed Best Response yields
a faster (exponential) tendency towards the Nash equilibrium for K = 0. On the other hand, for finite noise
the absorbing solutions are missing.
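
The implicit stationary condition of Eq. (119) has no closed-form solution, but it is easy to solve numerically. The sketch below uses bisection; the choice of z = 4, the sampled noise values, and the function names are assumptions for illustration.

```python
import math

T, R, P, S = 5.0, 3.0, 1.0, 0.0


def logistic(x):
    """Numerically safe 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def w_d_to_c(rho, z, K):
    """Transition rate w(D -> C) of Eq. (117), written in terms of rho."""
    gap = z * ((T - R) * rho + (P - S) * (1.0 - rho))  # U_D - U_C, always positive
    return logistic(-gap / K)


def stationary_rho(z, K, iterations=200):
    """Solve rho = w(D -> C) on [0, 1] by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if w_d_to_c(mid, z, K) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for K in (0.05, 0.5, 2.0, 10.0, 100.0):
    print(K, stationary_rho(4, K))
```

The printed densities grow with K and stay below 1/2, in line with the noise-sustained cooperation described above.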

The situation changes drastically when the players can follow Tit-for-Tat strategies or when they are positioned
on a lattice and the interaction is limited to their neighborhood, as will be detailed in the next sections.

6.4. Prisoner's Dilemma in finite populations 



Following the work by Nowak et al. (2004) we will discuss what happens in a population of N players who
can choose AllD or Tit-for-Tat (TFT) strategies, and play n games with each other within a round. In this
case the effective payoff matrix can be given as

$$ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} , \tag{120} $$

where a = Pn, b = T + P(n - 1), c = S + P(n - 1), and d = Rn. Notice that the original rank
order of the payoffs can be modified by n. Consequently, both AllD and TFT become strict Nash equilibria
(and ESSs) if n > (T - P)/(R - P). The deterministic replicator dynamics for infinite populations predicts
that TFT is eliminated by natural selection if its initial frequency is less than a critical value
x* = (a - c)/(a - b - c + d) (otherwise the frequency of AllD tends to zero).
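
The entries of the effective payoff matrix and the critical frequency x* can be evaluated directly. The sketch below does this with exact fractions; the default payoffs are Axelrod's values used earlier in this section, and the function names are assumptions for illustration.

```python
from fractions import Fraction


def effective_matrix(n, T=5, R=3, P=1, S=0):
    """Entries (a, b, c, d) of Eq. (120) for AllD (first) versus TFT (second)."""
    a = P * n              # AllD against AllD
    b = T + P * (n - 1)    # AllD against TFT: exploits once, then mutual defection
    c = S + P * (n - 1)    # TFT against AllD: exploited once, then mutual defection
    d = R * n              # TFT against TFT
    return a, b, c, d


def critical_tft_frequency(n, **payoffs):
    """Unstable interior fixed point x* of the replicator dynamics."""
    a, b, c, d = effective_matrix(n, **payoffs)
    return Fraction(a - c, a - b - c + d)


for n in (3, 4, 10):
    print(n, effective_matrix(n), critical_tft_frequency(n))
```

With these payoffs x* = 1/(2n - 3), so longer repeated interactions shrink the basin of attraction of AllD.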

Nowak et al. (2004) have considered a stochastic (Moran) process in which, at each time step, a player is
chosen for reproduction with a probability proportional to her payoff and her offspring is substituted for a
randomly chosen player. This stochastic process is investigated in the so-called weak-selection limit, when
only a small part of the payoff (fitness) comes from games, i.e.,

$$ U_D = 1 - w + w [a(i - 1) + b(N - i)]/(N - 1) , $$
$$ U_T = 1 - w + w [c \, i + d(N - i - 1)]/(N - 1) , \tag{121} $$

where the strength of selection is characterized by w (0 < w << 1), and the numbers of AllD and TFT
strategies are denoted by i and (N - i). This approach resembles the high temperature limit and seems to
be adequate for biological systems where strategy changes have little effect on overall fitness (Ohta, 2002).
The elementary steps can increase or decrease the value of i by 1 with probabilities $\mu_i^{+}$ and $\mu_i^{-}$ given as

$$ \mu_i^{+} = \frac{U_D \, i (N - i)}{N [i U_D + (N - i) U_T]} , \qquad \mu_i^{-} = \frac{U_T \, i (N - i)}{N [i U_D + (N - i) U_T]} . \tag{122} $$

This Moran process is equivalent to a random walk on sites i = 0, ..., N. The evolution is stopped when
the system reaches one of the two absorbing states: i = 0 and i = N. Nowak et al. (2004) have determined the
fixation probability $v_T$ that a single TFT strategy will invade the whole population containing (N - 1) AllD
strategies at the beginning. It is well known that $v_T = 1/N$ if w = 0. Thus selection favors TFT invading
AllD if $v_T > 1/N$. For the limit of large N and weak selection the analytical result predicts that TFT is
favored if x* < 1/3 (this is the so-called 1/3 rule).
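
For finite N the fixation probability of a single TFT player can be computed exactly from the ratio of the backward and forward transition probabilities. The sketch below uses the standard absorption formula for birth-death chains, which is not spelled out in the text; the function name, N = 100 and w = 0.001 are assumptions for illustration.

```python
def fixation_probability_tft(N, w, a, b, c, d):
    """Probability that one TFT player takes over N - 1 AllD players.

    Uses 1 / (1 + sum_k prod_{j<=k} gamma_j), where gamma_j is the ratio of the
    probabilities of losing and gaining one TFT player when j TFT players exist.
    """
    total, product = 1.0, 1.0
    for j in range(1, N):       # j = current number of TFT players
        i = N - j               # i = current number of AllD players
        U_D = 1 - w + w * (a * (i - 1) + b * (N - i)) / (N - 1)
        U_T = 1 - w + w * (c * i + d * (N - i - 1)) / (N - 1)
        product *= U_D / U_T    # common factors of the step probabilities cancel
        total += product
    return 1.0 / total


N, w = 100, 0.001
neutral = fixation_probability_tft(N, 0.0, 10, 14, 9, 30)
favored = fixation_probability_tft(N, w, 10, 14, 9, 30)    # n = 10 rounds, x* = 1/17
disfavored = fixation_probability_tft(N, w, 2, 6, 1, 6)    # n = 2 rounds, x* = 1
print(neutral, favored, disfavored)
```

The neutral case returns 1/N, the n = 10 matrix (with x* well below 1/3) gives a larger value, and the n = 2 matrix gives a smaller one, consistent with the 1/3 rule.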

When investigating the same system Antal and Scheuring (2006) have found that the average fixation
time $t_{\rm fix}$ is proportional to $N \ln N$ if the system has at least one ESS, and
$t_{\rm fix} \sim \ldots / [\sqrt{N}(N - 1)]$ if the two strategies coexist in the limit $N \to \infty$.



6.5. Spatial Prisoner's Dilemma with synchronized update 



In 1992 Nowak and May introduced a spatial evolutionary game to demonstrate that local interactions
within a spatial structure can maintain cooperative behavior indefinitely. In the first model they proposed
(Nowak and May, 1992, 1993), the players, located on the sites of a square lattice, could only follow two
memoryless strategies: AllD (to be denoted simply as D) and AllC (or simply C), i.e., unconditional defection
and unconditional cooperation, respectively. In most of their analysis they used a simplified, rescaled payoff
matrix,

$$ A = \begin{pmatrix} P & T \\ S & R \end{pmatrix} = \lim_{c \to -0} \begin{pmatrix} 0 & b \\ c & 1 \end{pmatrix} , \tag{123} $$

which only contains one free parameter b, but is expected to preserve the essence of the Prisoner's Dilemma
(see footnote 12).

In each round (time step) of this iterated game the players play a Prisoner's Dilemma with all their
neighbors, and the payoffs are summed up. After this, each player imitates the strategy of the neighboring
player (including herself) who has scored the highest payoff. Due to the deterministic, simultaneous update
this evolutionary rule defines a cellular automaton. [For a recent survey about the rich behavior of cellular
automata, see the book by Wolfram (2002).]
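
The synchronous update rule can be sketched compactly with NumPy. This is an illustrative implementation under stated assumptions: a Moore neighborhood with periodic boundaries, payoffs 1 (C against C) and b (D against C) with all other payoffs zero, and ties resolved in favor of the player's own strategy and then the first neighbor scanned. The tie-breaking convention and the function names are assumptions of this sketch.

```python
import numpy as np

OFFSETS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]


def shift(grid, dx, dy):
    """Periodic shift of a 2D array."""
    return np.roll(np.roll(grid, dx, axis=0), dy, axis=1)


def payoffs(coop, b, include_self=True):
    """Summed payoff of every site; coop holds 1 for C and 0 for D."""
    n_c = np.zeros(coop.shape, dtype=float)
    for dx, dy in OFFSETS:
        if (dx, dy) == (0, 0) and not include_self:
            continue
        n_c += shift(coop, dx, dy)          # number of cooperating opponents
    return np.where(coop == 1, n_c, b * n_c)


def step(coop, b, include_self=True):
    """One synchronous update: copy the best-scoring player in the neighborhood."""
    pay = payoffs(coop, b, include_self)
    best_pay, best_strategy = pay.copy(), coop.copy()
    for dx, dy in OFFSETS:
        if (dx, dy) == (0, 0):
            continue
        neighbor_pay, neighbor_strategy = shift(pay, dx, dy), shift(coop, dx, dy)
        better = neighbor_pay > best_pay    # strict: ties keep the earlier choice
        best_pay = np.where(better, neighbor_pay, best_pay)
        best_strategy = np.where(better, neighbor_strategy, best_strategy)
    return best_strategy


grid = np.ones((9, 9), dtype=int)
grid[4, 4] = 0                              # one defector in a sea of cooperators
print(step(grid, b=1.05).sum(), step(grid, b=1.2).sum())
```

With self-interaction, a solitary defector stays solitary for b = 1.05, whereas for b = 1.2 it converts its eight neighbors in one step.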



12 The case c = 0 is sometimes called a "Weak" Prisoner's Dilemma, where not only (D,D), but (D,C) and (C,D) are also
Nash equilibria. Nevertheless, it was found (Nowak and May, 1992) that when played as a spatial game the weak version has
the same qualitative properties as the typical version with c < 0, at least when |c| << 1.

Systematic investigations have been made for different types of neighborhoods. In most cases the authors
considered the Moore neighborhood (involving nearest and next-nearest neighbors) with or without
self-interactions. The computer simulations were started from random initial states (with approximately the
same number of defectors and cooperators) or from symmetrical starting configurations. Striking sequences
of patterns (kaleidoscopes, dynamic fractals) were found when the simulation was started from a symmetrical
state (e.g., one defector in the sea of cooperators).

Figure 23 shows typical patterns appearing after a large number of time steps with random initial
conditions. The underlying connectivity structure is a square lattice with periodic boundary conditions. The
players' income derives from nine games against nearest and next-nearest neighbors and against the player's
own strategy as opponent. These self-interactions favor cooperators, and become relevant in the formation
of a number of specific patterns.

[Fig. 23 panels include b = 1.05 and b = 1.77.]

Fig. 23. Stationary 80x80 patterns indicating the distribution of defectors and cooperators on a square lattice with synchronized
update (with self-, nearest-, and next-nearest-neighbor interactions) as suggested by Nowak and May (1992). The simulations
are started from random initial states with different values of b as indicated. The black (white) pixels refer to defectors
(cooperators) who have chosen defection (cooperation) in the previous step too. The dark (light) gray pixels indicate defectors
(cooperators) who invaded the given site in the very last step.

In these spatial models defectors receive one of the possible payoffs $U_D = 0, b, \ldots, zb$ depending on
the given configuration, whereas the cooperators' payoff is $U_C = 0, 1, \ldots, z$ (or $U_C = 1, \ldots, z + 1$ if
self-interaction is included), where z denotes the number of neighbors. Consequently, the parameter space for
b can be divided into small equivalency regions by break points, $b_{bp} = k_2/k_1$, where $k_1 = 2, 3, \ldots, z$ and
$k_2 = 3, 4, \ldots, z$ [and (z + 1) for self-interaction]. Within these regions the cellular automaton rules are
equivalent. Notice, however, that the strategy change at a given site depends on the actual configuration
within a sufficiently large surrounding area, including all neighbors of neighbors. For instance, for the Moore
neighborhood this means a square block of 5 x 5 sites. As a result, for 1 < b < 9/8 one can observe a frozen
pattern of solitary defectors, whose typical spatial distribution is plotted in Fig. 23 for b = 1.05.
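
The break points can be enumerated mechanically from the ranges of k_1 and k_2 quoted above. The sketch below lists them as exact fractions; the function name and the default window 1 < b < 2 are assumptions for illustration.

```python
from fractions import Fraction


def break_points(z, self_interaction=False, b_min=1, b_max=2):
    """Sorted distinct ratios k2/k1 inside (b_min, b_max).

    k1 runs over 2..z and k2 over 3..z, extended to z + 1 with self-interaction.
    """
    k2_max = z + 1 if self_interaction else z
    ratios = {Fraction(k2, k1)
              for k1 in range(2, z + 1)
              for k2 in range(3, k2_max + 1)}
    return sorted(r for r in ratios if b_min < r < b_max)


print([str(r) for r in break_points(8)])
print([str(r) for r in break_points(8, self_interaction=True)])
```

For the Moore neighborhood (z = 8) the value 9/8 only appears in the list when self-interaction is included, matching the upper limit of the frozen solitary-defector region quoted for that case.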

In the absence of self-interactions the pattern of solitary defectors can be observed for 7/8 < b < 1. For
b > 1, however, solitary defectors always receive the highest score, therefore their strategy is adopted by
their neighbors in the next step. In the step after this the payoff of the defecting offspring is reduced by
interacting with each other, and they flip back to cooperation. Consequently, for 1 < b < 7/5 we find that
the 3 x 3 block of defectors shrinks back to its original one-site size. In the sea of cooperators there are
isolated islands of defectors with fluctuating size, 1 x 1 <-> 3 x 3. These objects are easily recognizable in
several snapshots of Fig. 23. This phenomenon demonstrates how a strategy adoption (learning) mechanism
for interacting players can prevent the spreading of defection (exploitation) in a spatial structure.

In the formation of the patterns shown in Fig. 23, the invasion of cooperators along horizontal and
vertical straight line fronts plays a crucial role. As demonstrated in Fig. 24, this process can be observed for
b < 5/3 (or b < 2 in the presence of self-interactions). The invasion process stops when two fronts surround
defectors who form a network (or fragments of lines) as illustrated in Fig. 23. In some constellations the
defectors' income becomes sufficiently high to be followed by some neighboring cooperators. However, in
the subsequent steps, as described above, cooperators strike back, and this leads to local oscillations (limit
cycles) as indicated by the grey pixels in the snapshots.

Fig. 24. Local payoffs for two distributions of defectors (black boxes) and cooperators (white boxes) if z = 8. The cellular
automaton rules yield that cooperators invade defectors along the horizontal boundary for the left hand constellation if b < 5/3.
This horizontal boundary is stopped if 5/3 < b < 8/3, whereas defector invasion occurs for b > 8/3. Similar cooperator invasions
can be observed along both the horizontal and vertical fronts for the right hand distribution, except for the corner, where three
defectors survive if 7/5 < b < 5/3.

There exists a region of b, namely 8/5 < b < 5/3 (or 9/5 < b < 2 with self-interactions included), where
consecutive defector invasions also occur, and the average ratio of cooperators and defectors is determined by
the dynamical balance between these competing invasion processes. Within this region the average
concentration of strategies becomes independent of the initial configuration, except for some possible pathologies.
In other regions, the time-dependence of the strategy frequencies and the limit values depend on
the initial configuration (Nowak and May, 1993; Schweitzer et al., 2002). Figure 25 compares two series of
numerical results obtained by varying the initial frequency $\rho(t = 0)$ of cooperators on a lattice consisting
of $10^6$ sites. These simulations are repeated 10 times to demonstrate that the standard deviation of limit
values increases drastically for low values of $\rho(t = 0)$. The triangles represent typical behavior when the
system evolves into a "frozen (blinking) pattern".

Fig. 25. Monte Carlo data for the average frequency of cooperators as a function of the initial frequency of cooperators for b = 1.35
(triangles) and 1.65 (squares) in the absence of self-interaction.

The visualization of the evolution makes clear that cooperators have a fair chance of survival if they form
compact colonies. For random initial states the probability of finding compact cooperator colonies decreases
very fast with their concentration. In this case the dynamics becomes very sensitive to the initial
configuration. The last snapshot in Fig. 23 demonstrates that cooperators forming a k x l (k, l > 2) rectangular block
can remain alive even for 5/3 < b < 8/3 (or 2 < b < 3 for self-interactions), when cooperator invasions along
the horizontal and vertical fronts are already blocked.

Keeping in mind that the stationary frequencies of strategies depend on the initial configuration, it is
useful to compare the numerical results obtained in the presence or absence of self-interactions. As indicated
in Fig. 26 the coexistence of cooperators and defectors is maintained for larger b values in the case of
self-interactions. These numerical data (obtained on a large box containing $10^6$ sites) have sufficiently low
statistical error to demonstrate the occurrence of break points. We have to emphasize that for frozen
patterns the variation of $\rho(t = 0)$ in the initial state can cause significant deviation for the data denoted by
open symbols.

Fig. 26. Average (non-vanishing) frequency of cooperators as a function of b in the model suggested by Nowak and May (1992).
Diamonds (squares) represent Monte Carlo data obtained for a Moore neighborhood on the square lattice in the presence
(absence) of self-interactions. The simulations start from a random initial state with equal frequency of cooperators and
defectors. Closed symbols indicate states where the coexistence of cooperators and defectors is maintained by a dynamical
balance between opposite invasion processes. Open symbols refer to frozen (blinking) patterns. The arrows at the bottom (top)
indicate break points in the absence (presence) of self-interactions.

The results obtained via cellular automaton models have raised a lot of questions, and inspired people to
develop a wide variety of models. For example, in subsequent papers Nowak et al. (1994a,b) studied different
spatial structures including triangular and cubic lattices and a random grid, whose sites were distributed
randomly on a rectangular block with periodic boundary conditions and with a range of interaction limited by a
given radius. It turned out that cooperation can be maintained in spatial models even for some randomness.

The rigorous comparison of all the results obtained under different conditions is difficult, because most of
the models were developed without knowing all previous investigations published in different areas of science.
The conditions for the coexistence of the C and D strategies and some quantitative features of these states
on a square lattice were studied systematically for more general, two-parameter payoff matrices containing
b and c parameters by Lindgren and Nordahl (1994); Hauert (2001); Schweitzer et al. (2002). Evidently, on
the two-dimensional parameter space the break points form crossing straight lines.

The cellular automaton model with nearest-neighbor interactions was studied by Vainstein and Arenzon
(2001) on diluted square lattices (see the left hand structure in Fig. 14). If too many sites are removed the
spatial structure fragments into small parts, on which cooperators and defectors survive separately from
each other, and the final strategy frequencies depend on the initial state. The most interesting result was
found for small values of the fraction q of removed sites. The simulations have indicated that the fraction of
cooperators increases with q if $q < q_{th} \approx 0.05$. In this case the sparsely removed sites can play the role of
sterile defectors who block further spreading of defection.



Cellular automaton models on random networks were also studied by Abramson and Kuperman (2001),
Masuda and Aihara (2003), and Durán and Mulet (2005). These investigations highlighted some interesting
and general features. It was observed that local irregularities in small-world structures or random graphs
shrink the region of parameters where frozen patterns occur (Class 2 of cellular automata). On random
graphs this effect may prevent the formation of frozen patterns if the average value of connectivity C
is large enough (Durán and Mulet, 2005). Furthermore, the simulations also revealed at least two factors
which can increase the frequency of cooperators. Both an inhomogeneity in the degree distribution and a
sufficiently large clustering coefficient can support cooperation.



Kim et al. (2002) studied a cellular automaton model on a small-world graph, where many players are
allowed to adopt the strategy of an influential player. On this inhomogeneous structure the asymmetric role
of the players can cause large fluctuations, triggered by the strategy flips of the influential site. Many aspects
of this phenomenon will be discussed later on.

Huge efforts were focused on the extension of the strategy space. Most of these investigations studied
what happens if the players are allowed to follow a finite subset of stochastic reactive strategies [see, e.g.,
Nowak et al. (1994a); Lindgren and Nordahl (1994); Grim (1995); Brauchli et al. (1999)]. It was shown that
the addition of some Tit-for-Tat strategy to the standard D and C strategies supports the prevalence of
cooperation in the whole range of payoff parameters. A similar result was earlier found by Axelrod (1984) for
non-spatial games. Nowak et al. (1994a) demonstrated that the application of three cyclically dominating
strategies leads to a self-organizing pattern, to be discussed later on within the context of the spatial
Rock-Scissors-Paper game.



Grim (1995, 1996) studied a cellular automaton model with D, C, and several Generous Tit-for-Tat
strategies, characterized by different values of the q parameter. The chosen strategy set included those
strategies that play a dominating role in Figs. 19, 20, and 22. Besides deterministic cellular automata
Grim also studied the effects of mutants introduced in small clusters after each step. In both cases the
time-dependence of the strategy population was very similar to those plotted in Figs. 20 and 22. There
was, however, an important difference: the spatial evolutionary process ended up with the dominance of a
Generous Tit-for-Tat strategy, whose generosity parameter exceeded the optimum of the mean-field system
by a factor of 2 (Molander, 1985; Nowak and Sigmund, 1992; Grim, 1996). This can be interpreted as an
indication that spatial effects encourage greater generosity.

Within the family of spatial multi-strategy evolutionary Prisoner's Dilemma games the model introduced
by Killingback et al. (1999) represents another remarkable cellular automaton. In this approach a player's
strategy is characterized by an investment (incurring some cost for her) that will be beneficial for all her
neighbors including herself. This formulation of the Prisoner's Dilemma is similar to Public Good games as
described in Appendix A.8. The cellular automaton on a square lattice was started with (selfish) players
having low investment. The authors demonstrated that, due to spatial effects, these individuals can be
invaded by mutants who have a higher investment rate.

Another direction towards which cellular automaton models were extended involves the stochastic cellular
automata discussed briefly in Sec. 6.1. The corresponding spatial Prisoner's Dilemma game was introduced
and investigated by many authors, e.g., Nowak et al. (1994a); Mukherji et al. (1996); Kirchkamp (2000);
Hauert (2002), and further references are given in the paper by Schweitzer et al. (2005). The stochastic
elements of the rules are capable of destroying local regularities, which appear in deterministic cellular
automata belonging to Class 2 or 4. As a result, stochasticity modifies the phase boundary separating
quantitatively different behaviors in the parameter space. In the papers by Hauert (2002) and Schweitzer et al.
(2005) the reader can find phase diagrams obtained by numerical simulations. The main features caused
by these types of randomness are similar to those observed for random sequential update. In particular, if the
strategy change suggested by the deterministic rule is realized with low probability then the corresponding
dynamical system becomes equivalent to a model with random sequential update. In the next section we
will show that random sequential update usually provides a more efficient and convenient framework for
studying the effects of payoff parameters, noise, randomness, and different connectivity structures on the
emergence of cooperation.



6.6. Spatial Prisoner's Dilemma with random sequential update 

In many cases random sequential update provides a more realistic approach, as detailed by Huberman and Glance
(1993). Evolution starts from a random initial state, and the following elementary strategy adoption steps
are repeated: choose two neighboring players at random and the first player (at site x) adopts the second's
strategy (at site y) with a probability $P(s_x \to s_y)$ depending on the payoff difference. This rule belongs to
the general (smoothed) imitation rules defined by Eq. (64). In this section we use Eq. (65) and write the
transition probability as

$$ P(s_x \to s_y) = \frac{1}{1 + \exp[(U_x - U_y)/K]} . \tag{124} $$

Our principal goal is to study the effect of noise K. 
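
The elementary step described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the players sit on an L × L square lattice with periodic boundaries and four nearest neighbors, a cooperator earns 1 and a defector earns b from each cooperating neighbor (all other payoffs are 0), and K > 0. The function names and the 0/1 encoding of strategies are hypothetical and are not taken from the original simulations.

```python
import math
import random


def adoption_probability(u_x, u_y, K):
    """Eq. (124): probability that player x adopts the strategy of player y."""
    z = (u_x - u_y) / K  # requires K > 0
    if z > 700.0:        # exp(z) would overflow; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def payoff(s, x, y, b):
    """Total payoff of site (x, y); s[x][y] is 1 for C and 0 for D (assumed encoding)."""
    L = len(s)
    n_coop = (s[(x + 1) % L][y] + s[(x - 1) % L][y]
              + s[x][(y + 1) % L] + s[x][(y - 1) % L])
    # A cooperator earns 1 per cooperating neighbor, a defector earns b.
    return n_coop * (1.0 if s[x][y] == 1 else b)


def elementary_step(s, b, K, rng):
    """One random sequential update: a random player may imitate a random neighbor."""
    L = len(s)
    x, y = rng.randrange(L), rng.randrange(L)
    dx, dy = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
    nx, ny = (x + dx) % L, (y + dy) % L
    if s[x][y] == s[nx][ny]:
        return  # identical strategies: nothing can change
    p = adoption_probability(payoff(s, x, y, b), payoff(s, nx, ny, b), K)
    if rng.random() < p:
        s[x][y] = s[nx][ny]


def cooperator_frequency(s):
    """Fraction of cooperators on the lattice."""
    return sum(map(sum, s)) / (len(s) ** 2)


if __name__ == "__main__":
    rng = random.Random(1)
    L = 20
    lattice = [[rng.randint(0, 1) for _ in range(L)] for _ in range(L)]
    for _ in range(100 * L * L):  # 100 Monte Carlo steps per site
        elementary_step(lattice, b=1.03, K=0.4, rng=rng)
    print(cooperator_frequency(lattice))
```

Such a small system and short run only illustrate the update rule; they are far from the system sizes and sampling times needed for quantitative results near the transition points.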

In order to quantify the differences between synchronized and random sequential update, in Fig. 27 we 
compare two sets of numerical data obtained on square lattices with nearest- and next-nearest neighbor 
interactions (z = 8) in the absence of self-interactions. First we have to emphasize that "frozen" (blinking)
patterns cannot be observed for random sequential update. Instead, the system always develops into one
of the homogeneous absorbing states with $\rho = 0$ or $\rho = 1$, or into a two-strategy coexistence state,
independently of the initial state.

Fig. 27. Average frequency of cooperators as a function of b, using synchronized update (squares) and random sequential
updates (diamonds) for nearest- and next-nearest interactions. Squares are the same data as plotted in Fig. 26, diamonds are
obtained for K = 0.03 within the coexistence region of b.

The most striking difference is that for noisy dynamics cooperation can only be maintained up to a lower
threshold value of b. As mentioned above, for synchronized update cooperation is strongly supported by the
invasion of cooperators perpendicular to the horizontal or vertical interfaces. For random sequential update,
however, the invasion fronts become irregular, favoring defectors. At the same time the noisy dynamics is
capable of eliminating the solitary defectors that appear for low values of b, as illustrated in Fig. 27.

The left snapshot of Fig. 28 shows that solitary defectors can create an offspring positioned on one of the 
neighboring sites for 7/8 < b < 1 in the absence of self-interactions. As the parent and the offspring mutually 
reduce each other's income, one of them becomes extinct within a short time, and the survivor is ready to
create another offspring. Due to this mechanism defectors perform random walks on the lattice. These moving
objects collide and one of them is likely to be destroyed. On the other hand, there is some probability that
they split into two objects. In short, these processes are analogous to branching and coalescing random walks,
thoroughly investigated in non-equilibrium statistical physics. Cardy and Täuber (1996, 1998) have shown
that these systems exhibit a non-equilibrium phase transition (more precisely an extinction process) when
the parameters are tuned. This transition belongs to the so-called directed percolation universality class.

A very similar phenomenon can be observed at the extinction of cooperators in our spatial game. In the 
vicinity of the extinction point the cooperators form colonies (see right snapshot of Fig. 28), which also perform
branching and coalescing random walks. Consequently, both extinction processes exhibit the same universal
features (Szabó and Tőke, 1998; Chiappin and de Oliveira, 1999). However, the occurrence of these universal
features of the extinction transition is not restricted to random sequential update. Similar properties have
also been found in several stochastic cellular automata (synchronous update) (Domany and Kinzel, 1984;
Bidaux et al., 1989; Jensen, 1991; Wolfram, 2002).

Fig. 28. Distribution of defectors (black pixels) and cooperators (white areas) on a square lattice for b = 0.95 (left) and b = 1.036
(right) if K = 0.1.

The universal properties of this type of non-equilibrium critical phase transition have been intensively
investigated. For a survey see the book by Marro and Dickman (1999) or the review paper by Hinrichsen
(2000). The very robust features of this transition from the active phase ($\rho > 0$) to the absorbing phase ($\rho = 0$)
occur in many homogeneous spatial systems where the order parameter is scalar and the interactions are
short ranged (Janssen, 1981; Grassberger, 1982). The simplest model exemplifying this transition is the
contact process proposed by Harris (1974) as a toy model to describe the spread of an epidemic. In his
model healthy and infected objects are localized on a lattice. Infection spreads with some rate through
nearest-neighbor contacts (in analogy to a learning mechanism), while infected sites recover at a unit rate.
Due to its simplicity, the contact process allows for a very accurate numerical investigation of the universal 
properties of the critical transition. 
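
The rules of the contact process are simple enough to be sketched directly. The following Python fragment is an illustrative sketch only, under assumptions that are not part of the text above: the lattice is a one-dimensional ring, the infection rate is called `lam`, the run starts from the fully infected state, and time is advanced by the mean waiting time between events instead of an exponentially distributed one.

```python
import random


def contact_process_density(L, lam, t_max, seed=0):
    """Simulate the contact process on a ring of L sites, starting fully infected.

    Infected sites recover at unit rate and try to infect a randomly chosen
    nearest neighbor at rate lam. Returns the fraction of infected sites when
    the run stops (0.0 if the absorbing, infection-free state is reached).
    """
    rng = random.Random(seed)
    infected = [True] * L
    active = list(range(L))            # currently infected sites
    position = {i: i for i in active}  # index of each infected site in `active`
    t = 0.0
    while active and t < t_max:
        # The total event rate is (1 + lam) per infected site.
        t += 1.0 / ((1.0 + lam) * len(active))
        site = active[rng.randrange(len(active))]
        if rng.random() < 1.0 / (1.0 + lam):
            # Recovery: remove `site` from the active list in O(1).
            infected[site] = False
            idx = position.pop(site)
            last = active.pop()
            if last != site:
                active[idx] = last
                position[last] = idx
        else:
            # Infection attempt on a random nearest neighbor.
            target = (site + rng.choice((-1, 1))) % L
            if not infected[target]:
                infected[target] = True
                position[target] = len(active)
                active.append(target)
    return len(active) / L
```

Scanning `lam` and recording the returned density gives a rough picture of the transition between the active and the absorbing phase; accurate exponents require much larger systems and careful averaging.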

In the vicinity of the extinction point (here $b_{cr}$), the frequency of the given strategy plays the role of the
order parameter. In the limit $N \to \infty$ it vanishes as a power law,

$$ \rho \propto |b_{cr} - b|^{\beta} , \tag{125} $$

where $\beta = 0.583(4)$ in two dimensions, independently of many irrelevant details of the system. The algebraic
decrease of $\rho$ is accompanied by an algebraic divergence of the fluctuations of $\rho$,

$$ \chi = N \left[ \langle \rho^2(t) \rangle - \langle \rho(t) \rangle^2 \right] \propto |b_{cr} - b|^{-\gamma} , \tag{126} $$

as well as of the correlation length

$$ \xi \propto |b_{cr} - b|^{-\nu_\perp} , \tag{127} $$

and the relaxation time

$$ \tau \propto |b_{cr} - b|^{-\nu_\parallel} . \tag{128} $$

For two-dimensional systems the numerical values of the exponents are: $\gamma = 0.35(1)$, $\nu_\perp = 0.733(4)$, and
$\nu_\parallel = 1.295(6)$. Further universal properties and techniques to study these transitions are well described in
Marro and Dickman (1999) and Hinrichsen (2000) (and further references therein).
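
In practice the exponent in Eq. (125) is usually read off from the slope of a log-log plot. The sketch below shows one simple way to do this with an ordinary least-squares fit, assuming that the critical point is already known and that all data points lie inside the scaling region. The function name and the synthetic data are hypothetical illustrations, not results from the simulations discussed here.

```python
import math


def estimate_exponent(b_values, rho_values, b_cr):
    """Fit rho = A * |b_cr - b|**beta by linear regression on a log-log scale.

    Returns the pair (beta, A).
    """
    xs = [math.log(abs(b_cr - b)) for b in b_values]
    ys = [math.log(r) for r in rho_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = s_xy / s_xx                            # slope of the log-log line
    amplitude = math.exp(mean_y - beta * mean_x)  # intercept gives the prefactor
    return beta, amplitude


if __name__ == "__main__":
    # Synthetic, noise-free data generated with a known exponent.
    b_cr = 1.07277
    distances = [10 ** (-k / 4) for k in range(8, 17)]
    b_values = [b_cr - d for d in distances]
    rho_values = [0.8 * d ** 0.583 for d in distances]
    print(estimate_exponent(b_values, rho_values, b_cr))
```

With real Monte Carlo data the fitted slope depends sensitively on the assumed value of the critical point, so the critical point and the exponent are normally refined together.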

The above divergences cause serious difficulties in numerical simulations. Within the critical region the
accurate determination of the average frequencies requires large system sizes, $L \gg \xi$, and simultaneously
long thermalization and sampling times, $t_{th}, t_s \gg \tau$. In typical Monte Carlo simulations the suitable
accuracy for $\rho \sim 0.01$ can be achieved by run times as long as several weeks on a PC.

The peculiarity of evolutionary games is that there may be two subsequent critical transitions. The 
numerical confirmation of the universal features becomes easier, if the spatial evolutionary game is as simple 
as possible. The simplicity of the model also allows us to perform generalized mean-field analyses, using
larger and larger cluster sizes (for details see Appendix C). Therefore, in the rest of this section our attention
will be focused on those systems, where the players only have four neighbors, and the evolution is governed 
by sequential strategy adoptions, as described at the beginning of this section. 

Figure 29 shows a typical variation of $\rho$, when we tune the value of b within the region of coexistence
for K = 0.4. These data refer to two critical transitions at $b_{c1} = 0.9520(1)$ and at $b_{c2} = 1.07277(2)$. The
log-log plot in the inset provides numerical evidence that both extinction processes belong to the directed 
percolation universality class. 




Fig. 29. Average frequency of cooperators as a function of b on the square lattice for K = 0.4. Monte Carlo results are denoted
by squares. Solid, dashed, and dotted lines indicate the predictions of the generalized mean-field technique at the level of 3 × 3-,
2 × 2-, and 2-site clusters. Inset shows the log-log plot of the order parameter $\Phi$ vs. $|b - b_{cr}|$, where $\Phi = 1 - \rho$ or $\Phi = \rho$ in the
vicinity of the first and second transition points. The solid line has the slope $\beta = 0.58$ characterizing directed percolation.



In Fig. 29 the Monte Carlo data are compared with the results of the generalized mean-field technique. The
large difference between the Monte Carlo data and the prediction of the pair approximation is not surprising,
if we know that traditional mean-field theory (one-site approximation) predicts $\rho = 0$ at b > 1 for any K
as detailed above. As expected, the prediction improves gradually for larger and larger clusters. The slow
convergence towards the exact (Monte Carlo) results for increasing cluster sizes highlights the importance
of short range correlations related to the occurrence of solitary defectors and cooperator colonies. According
to these results, the first continuous (linear) transition appears at $b = b_{c1} < 1$. As this transition lies outside
the parameter region of the Prisoner's Dilemma (b > 1), we do not consider it in the sequel.

It is worth mentioning that the four-site approximation predicts a first-order phase transition, i.e., a sudden
jump to $\rho = 0$ at $b = b_{c2}$, while the more accurate nine-site approximation predicts linear extinction at a
threshold value $b_{c2}^{(9s)} = 1.0661(2)$, very close to the exact result. We have to emphasize that the generalized
mean-field method is not able to reproduce the exact algebraic behavior for any finite cluster size, although
its accuracy increases gradually. Despite this shortcoming, this technique can be used to estimate the region
of parameters where strategy coexistence is expected.

By the above techniques one can determine the transition point $b_{c2} > 1$ for different values of K. The
remarkable feature of the results (see Fig. 30) is the extinction of cooperators for b > 1 in the limits when
K goes to 0 or $\infty$ (Szabó et al., 2005). In other words, there exists an optimum value of noise to maintain
cooperation at the highest possible level. This means that noise plays a crucial role in the emergence of 
cooperation, at least in the given connectivity structure. 

The prediction of the pair approximation cannot be illustrated in Fig. 30 because the corresponding
data lie outside the plotted region. The most serious shortcoming of this approximation is that it predicts
$\lim_{K \to 0} b_{c2} = 2$. Notice, however, that the generalized mean-field approximations are capable of reproducing
the correct result, $\lim_{K \to 0} b_{c2} = 1$, at higher levels.

Fig. 30. Monte Carlo data (squares) for the critical value of b as a function of K on the square lattice. The solid and dashed
lines denote predictions of the generalized mean-field technique at the levels of 3 × 3- and 2 × 2-site approximations.

The occurrence of noise-enhanced cooperation can be demonstrated more strikingly when we plot the
frequency of cooperators against K at a fixed $b < \max(b_{c2})$. As expected, the results show two subsequent
critical transitions belonging to the directed percolation universality class, as demonstrated in Fig. 31.

Fig. 31. Frequency of cooperators has a maximum when increasing K for b = 1.03. Inset shows the log-log plot of the frequency
of cooperators vs. $|K - K_c|$ ($K_{c1} = 0.0841(1)$ and $K_{c2} = 1.151(1)$). The straight line indicates the power-law behavior
characterizing directed percolation.



Using a different approach, Traulsen et al. (2004) also found that the learning mechanism (for weak
external noise) can drive the system into a state where the unfavored strategy appears with an enhanced
frequency. They showed that the effect can be present in other games (e.g., Matching Pennies)
too. The frequency of the unfavored strategy reaches its maximum at intermediate noise intensities in a
resonance-like manner. Similar phenomena, called stochastic resonance, are widely studied in climatic change
(Benzi et al., 1982), biology (Douglass et al., 1993), ecology (Blarer and Doebeli, 1999), and excitable media
(Jung and Mayer-Kress, 1995; Gammaitoni et al., 1998) when considering the response to a periodic external
force. The present phenomenon, however, is more similar to coherence resonance (Pikovsky and Kurths,
1997; Traulsen et al., 2004; Perc, 2005; Perc and Marhl, 2006) because no periodic force is involved here.
In other words, the enhancement in the frequency of cooperators appears as a nonlinear response to purely
noisy excitations.

The analogy becomes more striking when we also consider additional payoff fluctuations. In a recent
paper Perc (2006) studied an evolutionary Prisoner's Dilemma game with a noisy dynamical rule at a very
low value of K, and made the payoffs stochastic by adding some noise $\xi$, chosen randomly in the range
$-\sigma < \xi < \sigma$. When varying $\sigma$ the numerical simulations indicated a resonance-like behavior in the frequency
of cooperators. His curves, obtained for a somewhat different parametrization, are very similar to those
plotted in Fig. 31.

The disappearance of cooperation in the zero noise limit for the present topology is contrary to naive
expectations that could be deduced from previous simulations (Nowak et al., 1994a). To further clarify the
reasons, systematic investigations were made for different connectivity structures (Szabó et al., 2005).
According to the simulations, this evolutionary game exhibits qualitatively similar behavior if the connectivity
is characterized by a simple square or cubic (z = 6) lattice, a random regular graph (or Bethe lattice), or a
square lattice of four-site cliques (see the right hand side of Fig. 32). Notice that these structures are
practically free of triangles, or the overlapping triangles form larger cliques that are not connected by triangles
but by single links (see the top structure on the right in Fig. 32).

In contrast with these, cooperation is maintained in the zero noise limit on two-dimensional triangular
(z = 6) and kagome (z = 4) lattices, on the three-dimensional body centered cubic lattice (z = 8) and on
random regular graphs (or Bethe lattice) of one-site connected triangles (see the right hand side of Fig. 32).
Similar behavior was found on the square lattice when second-neighbor interactions are also taken into
account. The common topological feature of these structures is the percolation of the overlapping triangles
over the whole system. The exploration of this topological feature of networks has begun very recently
(Palla et al., 2005; Derényi et al., 2005), and for the evolutionary Prisoner's Dilemma it seems to be more
important than the spatial character of the connectivity graph. In order to separate the dependence on the
degree of nodes from other topological characteristics, in Fig. 32 we compare results obtained for structures
with the same number of neighbors, z = 4. It is worth mentioning that these qualitative differences are also
confirmed by the generalized mean-field technique for sufficiently large clusters (Szabó et al., 2005).




Fig. 32. Critical values of b vs. K for five different connectivity structures characterized by z = 4. Monte Carlo data are indicated
by squares (square lattice), pluses (random regular graph or Bethe lattice), ◇ (lattice of four-site cliques), △ (kagome lattice),
and ▽ (regular graph of overlapping triangles). The last three structures are illustrated above from top to bottom on the
right-hand side.

In the light of the above results it is conjectured that for the given dynamical rule cooperation can 
be maintained in the zero noise limit only for those regular connectivity structures, where the overlapping 
triangles span an infinitely large portion of the system for $N \to \infty$. For low values of K the highest frequency
of cooperators is found on those random (non-spatial) structures, where two overlapping triangles have only 
one common site (see Fig. 32). 






The advantageous feature of these topologies (Vukov et al., 2006) can be explained most easily on those
z = 4 structures where the overlapping triangles have only one common site (the middle and bottom
structures on the right hand side of Fig. 32). Let us assume that originally only one triangle is occupied by
cooperators in the sea of defectors. Such a situation is plotted in Fig. 33.





Fig. 33. Schematic illustration of the spreading of cooperators (white dots) within the sea of defectors (black dots) on a network 
built up from one-site overlapping triangles. These structures are locally similar to those plotted in the middle and bottom 
structures on the right-hand side of Fig. 32. Arrows show the direction of the most frequent transitions in the low noise limit. 

The payoff of cooperators within a triangle is 2, neighboring defectors receive b, and all the other defectors
get 0. For this constellation the most likely process is that one of the neighboring defectors adopts the more
successful cooperator strategy. As the state of the new cooperator is not stable, she can return to defection
again within a short time, unless the other neighboring defector changes to cooperation too, provided that
b < 3/2. In the latter case we obtain another triplet of cooperators. This is a stable configuration against
the attack of neighboring defectors for low noise. Similar processes can occur at the tip of the branches, if
the overlapping triangles form a tree. Consequently, the iteration of these events gives rise to a growing tree
of overlapping cooperator triplets. The growth process, however, is blocked at defector sites separating two
branches of the tree. This blocking process is missing if only one tree of cooperator triplets exists, as on a
tree-like structure. Nevertheless, it constrains the spreading of cooperators on spatial structures (e.g., on
the kagome lattice). The blocking process controls the spreading of cooperators (and assures the coexistence
of cooperation and defection) for both types of connectivity structures, if many trees of cooperator triplets
are simultaneously present in the system. Evidently, a more complicated analysis is required to clarify the role
of different topological properties for z > 4, as well as for cases when the overlapping triangles have more
than one common site.
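
The payoff values quoted in this argument, and the origin of the condition b < 3/2, can be checked with a short calculation. The sketch below is only an illustration under simplifying assumptions: it keeps just two triangles sharing one site, and it omits the remaining neighbors of each site because they are all defectors and therefore contribute nothing to anybody's payoff. The site labels and the function name are hypothetical.

```python
def payoffs(adj, strategy, b):
    """Total payoff of every site.

    A cooperator earns 1 per cooperating neighbor, a defector earns b per
    cooperating neighbor; all other encounters yield 0.
    """
    result = {}
    for site, neighbors in adj.items():
        n_coop = sum(strategy[n] == "C" for n in neighbors)
        result[site] = n_coop * (1.0 if strategy[site] == "C" else b)
    return result


# Two triangles, {0, 1, 2} and {0, 3, 4}, sharing site 0.
adj = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [0, 3]}
b = 1.4

# Initial state: one cooperator triangle in a sea of defectors.
before = payoffs(adj, {0: "C", 1: "C", 2: "C", 3: "D", 4: "D"}, b)
# Cooperators get 2, the neighboring defectors (sites 3 and 4) get b.

# Site 3 has adopted cooperation.
after = payoffs(adj, {0: "C", 1: "C", 2: "C", 3: "C", 4: "D"}, b)
# The shared site 0 now gets 3, the new cooperator gets 1,
# and the remaining defector (site 4) gets 2b.

print(before, after)
```

In this reduced picture the remaining defector earns 2b while her cooperating neighbor at the shared site earns 3, so imitation of the cooperator is favored for 2b < 3, which is consistent with the condition b < 3/2 stated above.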

The above analysis also suggests some further interesting conclusions (conjectures). In the large K limit, 
simulations show similar behavior for all the investigated spatial structures. The highest rate of cooperation 
occurs on random regular graphs, where the role of loops is negligible. In other words, for high noise the 
(quenched) randomness in the connectivity structure provides the highest rate of cooperation, at least, if 
the structure of connectivity is restricted to regular graphs with z = 4. 

According to preliminary results, some typical phase diagrams on the K-b plane are depicted in Fig. 34.
The versions on the top have already been discussed above. The bottom left diagram is expected to occur
on the kagome lattice if the strategy adoption probability from some players (with fixed position) is reduced
by a suitable factor (Szolnoki and Szabó, 2007). The fourth type (bottom right) is characteristic of some
one-dimensional systems.



6.7. Two-strategy Prisoner's Dilemma game on a chain

First we consider a system with players located on the sites x of a one-dimensional lattice. The players
follow one of the two strategies, $s_x = C$ or $D$, and their total utility comes from a Prisoner's Dilemma
game with their left and right neighbors. If the evolution is governed by some imitative learning mechanism,
which involves nearest neighbors, then a strategy change can only occur at those pairs of sites where the
neighbors follow different strategies. At these sites the maximum payoff of cooperators, 1, is always lower
than the defectors' minimum payoff b. Consequently, for any Darwinian selection rule cooperators should
become extinct sooner or later.

Fig. 34. Possible schematic phase diagrams of the Prisoner's Dilemma on the K-b plane. The curves show the phase transitions
between the coexistence region (C + D) and homogeneous solutions (C or D). The $K \to \infty$ limit reproduces the mean-field
result $b_{c1} = b_{c2} = 1$ for three plots; the exception represents a behavior occurring for some inhomogeneous strategy adoption
rates.



In one-dimensional models the imitation rule leads to a domain growing process (where one of the strategies
is favored) driving the system towards the homogeneous final state. Nakamaru et al. (1997) studied the
competition between D and Tit-for-Tat strategies under different evolutionary rules. It was demonstrated
that the pair approximation correctly predicts the final state, which depends on the evolutionary rule and the
initial strategy distribution.

Considering C and D strategies only, the behavior changes significantly if second neighbors are also
taken into account. In this case the players have four neighbors and the results can be compared with
those discussed above (see Fig. 32), provided we choose the same dynamical rule. Unfortunately, systematic
investigations are hindered by extremely slow transient events. According to the preliminary results, the two
subsequent phase transitions coincide, $b_{c1} = b_{c2} > 1$, if the noise exceeds a threshold value, $K > K_{th}$. This
means that evolution reaches a homogeneous absorbing state of cooperators (defectors) for $b < b_{c1} = b_{c2}$
($b > b_{c1} = b_{c2}$). For lower noise levels, $K < K_{th}$, coexistence occurs in a finite region, $b_{c1}(K) < b < b_{c2}(K)$.
The simulations indicate that $b_{c2} \to 1$ if K goes to zero. In short, for this connectivity structure the system
represents another class of behavior, complementing those discussed in the previous section.

Very recently Santos and Pacheco (2005) have studied a similar evolutionary Prisoner's Dilemma game
on the one-dimensional lattice, varying the number z of neighbors. In their model a randomly chosen player
x adopts the strategy of her neighbor y (chosen at random) with a probability $(U_y - U_x)/U_0$ (where $U_0$ is
a suitable normalization factor) if $U_y > U_x$. This is Schlag's Proportional Imitation rule, Eq. (62). It turns
out that cooperators can survive if $4 < z < z_{max} \approx 64$. More precisely, the range of b with coexisting C and
D strategies decreases when z increases. Similar results are reported by Ifti et al. (2004) and Tang et al.
(2006), who used different dynamical rules. As expected, a mean-field type behavior governs the system
for sufficiently large z.
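
For comparison with the smoothed rule of Eq. (124), the proportional imitation step on a ring with z neighbors can be sketched as follows. This is a hedged illustration, not the implementation used in the cited work: the 0/1 encoding of strategies, the function names, and the idea of passing the normalization factor as a plain argument `u_0` are assumptions made here, and the payoffs follow the convention that a cooperator earns 1 and a defector earns b per cooperating neighbor.

```python
import random


def proportional_imitation_probability(u_x, u_y, u_0):
    """Probability that x adopts y's strategy: (U_y - U_x) / U_0 if U_y > U_x."""
    if u_y <= u_x:
        return 0.0
    return min(1.0, (u_y - u_x) / u_0)


def ring_payoff(s, x, z, b):
    """Total payoff of site x on a ring where each site has z/2 neighbors per side.

    s[x] is 1 for a cooperator and 0 for a defector (assumed encoding).
    """
    n = len(s)
    half = z // 2
    n_coop = sum(s[(x + d) % n] for d in range(-half, half + 1) if d != 0)
    return n_coop * (1.0 if s[x] == 1 else b)


def imitation_step(s, z, b, u_0, rng):
    """A randomly chosen player x may imitate a randomly chosen neighbor y."""
    n = len(s)
    half = z // 2
    x = rng.randrange(n)
    d = rng.choice([k for k in range(-half, half + 1) if k != 0])
    y = (x + d) % n
    p = proportional_imitation_probability(
        ring_payoff(s, x, z, b), ring_payoff(s, y, z, b), u_0
    )
    if rng.random() < p:
        s[x] = s[y]
```

One possible choice for `u_0` is the largest payoff difference that can occur, z times b in this convention, which keeps the ratio between 0 and 1; other normalizations are equally admissible.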

The importance of one-dimensional systems lies in the fact that they act as starting structures for small-world
graphs as suggested by Watts and Strogatz (1998), who substituted random links for a portion q
of bonds connecting nearest- and next-nearest neighbors on a one-dimensional lattice (see Fig. 16). The
long-range connections generated by this rewiring process decrease the average distance between sites, and
produce a small world phenomenon (Milgram, 1967), which characterizes typical social networks.



6.8. Prisoner's Dilemma on social networks



The spatial Prisoner's Dilemma has been studied by many authors on different random networks. Some 
results for synchronized update were briefly discussed in Sec. 6.5. In Fig. 34 we have compared the phase 
diagrams for regular graphs, i.e., lattices and random graphs where the number of neighbors is fixed. However, 
the assumption of a uniform degree is not realistic for most social networks, and these models do not 
necessarily give adequate predictions. In this section we focus our attention on graphs where the degree has 
a non-uniform scale-free distribution. The basic message is that degree heterogeneity can drastically enhance 
cooperation. 

The emergence of cooperation around the largest hub was reported by Pacheco and Santos (2005). To
understand the effect of heterogeneity in the connectivity network, first we consider a simple situation,
which represents a typical part of a scale-free network. We restrict our attention to the surroundings of two
connected central players (called hubs), each linked to a large number of other, less connected players, as
shown in Fig. 35. Following Santos et al. (2006c), we wish to illustrate the basic mechanism by which cooperation
gets supported when a cooperator and a defector face each other in such a constellation.





Fig. 35. A typical subgraph of a scale-free social network, where two connected central players are linked to many others having
significantly fewer neighbors.



For the sake of simplicity we assume that both the cooperator x and the defector y are linked to the same
number $N_x = N_y$ of additional co-players (xn and yn players), who follow the C and D strategies with
equal probability in the initial state. This random initial configuration provides the highest total income for
the central defector. The central cooperator's total utility is lower, but exceeds the income of most of her
neighbors, because she has many more connections. We can assume that the evolutionary rule is Smoothed
Imitation defined by Eq. (124), with the imitated player chosen randomly from the neighbors. The influence
of the neglected part of the whole system on xn and yn players is taken into consideration in a simplified
manner. We assume no direct interaction with the rest of the population but posit that xn and yn players
adopt a strategy from each other with probability $P_{rsa}$ (rsa: random strategy adoption), because the strategy
distribution in this subset is similar to that of the whole system.

The result of a numerical simulation is shown in Fig. 36. At the beginning of the evolutionary process the
frequency of cooperators increases (decreases) in the neighborhood of the central cooperator (defector), and
this variation is accompanied by an increase (decrease) of the utility $U_x$ ($U_y$). Consequently, after some time
the central cooperator will obtain the highest payoff and becomes the best player to be followed by others.
A necessary condition for this scenario is $N_x \approx N_y \gg 1$, which can be satisfied in scale-free networks, and that
the random effect of the surroundings, characterized by $P_{rsa}$, should not be too strong. A weak asymmetry
in the number of neighbors cannot modify this picture.


This mechanism seems to work well for all scale-free networks, as was reported by Santos and Pacheco
(2005), Santos et al. (2006c), and Santos and Pacheco (2006), who observed a significant enhancement in the
frequency of cooperators. The stationary frequency of cooperators was determined both for the Barabási and Albert
(1999) and Dorogovtsev et al. (2001) structures (see Fig. 17). Cooperation remains high in the whole
region of b, especially for the Dorogovtsev-Mendes-Samukhin network (see Fig. 37). To illustrate the striking
enhancement, these data are compared with the results obtained for two lattices at the optimum value of
noise (i.e., K = 0.4 on the square lattice and K = 0 on the kagome lattice) in Fig. 37.

Fig. 36. Frequency (probability) of cooperators as a function of time for the topology in Fig. 35 at the central sites x and y,
as well as at their neighbors (xn and yn). Initially $s_x = C$ and $s_y = D$, and the neighboring players choose C or D with
equal probability. The numerical results are averaged over 1000 runs for b = 1.5, $P_{rsa} = 0.1$, and the number of neighbors

Fig. 37. Frequency of cooperators vs. b for two different scale-free connectivity structures, where the average number of neighbors
is $\langle z \rangle = 4$ (Santos and Pacheco, 2005). The Monte Carlo data are denoted by pluses (Barabási-Albert model) and triangles
(Dorogovtsev-Mendes-Samukhin model). Squares and the dashed line show $\rho$ on the square and kagome lattices, resp., at a
noise level K where $b_{cr}$ reaches its maximum.



Santos et al. (2006c) found no qualitative change when using asynchronous update instead of the synchronous
rule. The simulations clearly indicated that the highest frequency of cooperation occurs on sites
with a large degree. Furthermore, contrary to what was observed for regular graphs, the frequency of
cooperators increases with $\langle z \rangle$. Conversely, cooperative behavior evaporates if the edges connecting the hubs
are artificially removed. Considering an evolutionary Prisoner's Dilemma game with synchronized update
(without irrational strategy adoption) Gómez-Gardeñes et al. (2007) have found that cooperation remains
unchanged in the vicinity of hubs. All these features support the conclusion that the appearance of connected
hubs is responsible for enhanced cooperation in these structures.

Note that in these simulations an imitative adoption rule was applied, and the total utility was posited
to be simply additive. These elements favor cooperative players with many neighbors. Robustness of the
above mechanism is questionable for other types of dynamics like Best Response, or when the utility is
non-additive. Indeed, using different evolutionary rules Wu et al. (2005b) have studied what happens on a
scale-free structure if the strategy adoption is based on the normalized payoff (payoff divided by the number
of co-players) rather than the total payoff. The simulations indicate that the huge increase in the cooperator
frequency found in Santos et al. (2006a) is largely suppressed.

A weaker, although well-detectable consequence of the inhomogeneous degree distribution was observed
by Santos et al. (2005), when they compared the stationary frequency of cooperators on the classic (see
Fig. 16) and regular version of the small-world structure suggested by Watts and Strogatz (1998). They
found that on regular structures (with z = 4) the frequency of cooperators decreases (increases) below
(above) a threshold value of b, when the ratio of rewired links is increased.

In some sense hubs can be considered as privileged players, whose interaction with the neighborhood
is asymmetric. Such a feature was artificially built into a cellular automaton model by Kim et al. (2002),
who observed large drops in the level of cooperation after flipping an influential site from cooperation to
defection. Very recently Wu et al. (2006), Ren et al. (2006), and Szolnoki and Szabó (2007) have demonstrated
numerically that inhomogeneities in the strategy adoption rate can also enhance cooperation. Besides
inhomogeneities, the percolation of the overlapping triangles in the connectivity structure can also
give an additional boost for cooperation. Notice that the highest frequency of cooperators was found for
the Dorogovtsev-Mendes-Samukhin structure (see Fig. 17), where the construction of the growing graph
guarantees the percolation of triangles.

All the above analysis is based on the payoff matrix given by Eq. (123). In recent years another
parametrization of A has frequently been used. Within this approach the cooperator pays a cost c for her
opponent to receive a benefit b (0 < c < b). The defector pays nothing and does not distribute benefits. The
corresponding payoff matrix is:



\[
A = \begin{pmatrix} 0 & b \\ -c & b - c \end{pmatrix} \tag{129}
\]



Using this payoff matrix Ohtsuki et al. (2006) have studied the fixation of a solitary C (and D) strategy
on finite social networks for several evolutionary rules (e.g., "death-birth", "birth-death", and imitation
updating). For example, in the elementary steps of "death-birth" updating a random player is chosen to
die and subsequently the neighboring players compete for the empty site in proportion to their fitness. In
the so-called weak-selection limit, defined by expression (121), the pair approximation can be performed
analytically. The pair approximation predicts that the cooperator has on average one more cooperator
neighbor than the defector, and the cooperators have a fixation probability larger than 1/N (i.e., C is favored
in comparison to the voter model dynamics) if the ratio of benefit to cost exceeds the average number of
neighbors (b/c > z). The results of simulations (performed on different networks) show sufficiently good
agreement with the prediction of the pair approximation, although this approximation neglects the role of loops
in the connectivity structure. Evidently, the accuracy of the pair approximation improves as z is increased. It
is worth noting that this simple rule resembles Hamilton's rule, which says that kin selection
can favor cooperation if b/c > 1/r, where the social relatedness r can be measured by the inverse of the
number of neighbors (Hamilton, 1964a). Nowak (2006b) has shown that altruistic behavior emerges if the
benefit-to-cost ratio (b/c) exceeds some threshold value for three other mechanisms called direct reciprocity,
indirect reciprocity, and group selection.
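
The elementary step of "death-birth" updating can be sketched as follows. This is only an illustration under stated assumptions: the fitness form 1 - w + w * (accumulated payoff) with small w is one common way to express weak selection and is assumed here rather than copied from expression (121); the ring graph, the parameter values, and all function names are hypothetical.

```python
import random


def payoff_matrix(b, c):
    """Payoffs of Eq. (129); keys are (own strategy, co-player strategy)."""
    return {("D", "D"): 0.0, ("D", "C"): b, ("C", "D"): -c, ("C", "C"): b - c}


def fitness(node, graph, strategy, A, w):
    """Assumed weak-selection fitness: 1 - w + w * (accumulated payoff)."""
    total = sum(A[(strategy[node], strategy[nb])] for nb in graph[node])
    return 1.0 - w + w * total


def death_birth_step(graph, strategy, A, w, rng):
    """A random player dies; her neighbors compete for the empty site
    with probability proportional to their fitness."""
    dead = rng.choice(sorted(graph))
    neighbors = graph[dead]
    weights = [fitness(nb, graph, strategy, A, w) for nb in neighbors]
    parent = rng.choices(neighbors, weights=weights, k=1)[0]
    strategy[dead] = strategy[parent]
    return dead, parent


def cooperation_favored(b, c, z):
    """Pair-approximation rule quoted in the text: b/c > z."""
    return b / c > z


# Hypothetical example: ring of N = 10 players (z = 2) with a single cooperator.
N = 10
ring = {i: [(i - 1) % N, (i + 1) % N] for i in range(N)}
strategy = {i: "D" for i in range(N)}
strategy[0] = "C"
rng = random.Random(1)
A = payoff_matrix(b=3.0, c=1.0)
dead, parent = death_birth_step(ring, strategy, A, w=0.01, rng=rng)
print(dead, parent, cooperation_favored(3.0, 1.0, z=2))
```

Repeating the step until the population becomes homogeneous, and counting how often the single cooperator takes over, would give a numerical estimate of the fixation probability to be compared with 1/N.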

All the above phenomena indicate that there are possibly many ways to influence and enhance cooperative
behavior in social networks. Further investigations would be required to clarify the distinguished role of noise,
appearance of different time scales, and inhomogeneities in these structures.



6.9. Prisoner's Dilemma on evolving networks 



In real-life situations the preferential choice and refusal of partners play an important role in the emergence
of cooperation. Several aspects of this process have been investigated for several years by using different
dynamics in mathematical models (Ashlock et al., 1996; Bala and Goyal, 2000; Hanaki et al., 2006) and
experimentally [see (Hauk and Nagel, 2001; Coricelli et al., 2004) and further references therein].

An interesting model where the social network and the individual strategies co-evolve was introduced by
Zimmermann et al. (2000, 2004). The construction of the system starts with the creation of a random graph
[as suggested by (Erdős and Rényi, 1959)], which serves as an initial connectivity structure with a prescribed
average degree. The strategy of the players, positioned on this structure, can be either C or D. In discrete
time steps (t = 1, 2, ...) each player plays a one-shot Prisoner's Dilemma game with all her neighbors and
the payoffs are summed up. Knowing the payoffs, unsatisfied players (whose payoff is not the highest in their
neighborhood) adopt the strategy of their best neighbors. This is the deterministic version of the Imitate
the Best rule in Eq. (66).

It is assumed that after the imitation phase unsatisfied players who have become defectors (and only these
players) break the link with the imitated defector with probability p. The broken link is replaced by a new
link to a randomly chosen player (self-links and multiple links are forbidden). This network adaptation rule
conserves the number of links and allows cooperators to receive new links from the unsatisfied defectors.
The simulation starts from a random initial strategy distribution, and the evolution of the network, whose
intensity is characterized by p, is switched on after some thermalization time, typically 150 generations.
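
One generation of such a co-evolutionary rule can be sketched as below. This is a simplified illustration, not the original implementation: the payoff convention (C vs C pays 1, D vs C pays b, otherwise 0), the tie-breaking by lowest label, the small ring used as the initial graph, and all names are assumptions made for the example.

```python
import random


def payoffs(adj, strat, b):
    """Summed payoffs; assumed weak PD: C vs C -> 1, D vs C -> b, otherwise 0."""
    pay = {}
    for i, nbrs in adj.items():
        n_coop = sum(1 for j in nbrs if strat[j] == "C")
        pay[i] = n_coop * (1.0 if strat[i] == "C" else b)
    return pay


def generation(adj, strat, b, p, rng):
    """One synchronous generation: imitation followed by link rewiring."""
    pay = payoffs(adj, strat, b)
    # Unsatisfied players: some neighbor earns strictly more than they do.
    imitated = {}
    for i, nbrs in adj.items():
        if nbrs:
            best = max(sorted(nbrs), key=lambda j: pay[j])
            if pay[best] > pay[i]:
                imitated[i] = best
    new_strat = dict(strat)
    for i, best in imitated.items():
        new_strat[i] = strat[best]
    # Players who copied a defector drop that link with probability p
    # and connect to a random player who is not yet a neighbor.
    for i, best in imitated.items():
        if strat[best] == "D" and best in adj[i] and rng.random() < p:
            candidates = [k for k in adj if k != i and k not in adj[i]]
            if candidates:
                new = rng.choice(candidates)
                adj[i].discard(best)
                adj[best].discard(i)
                adj[i].add(new)
                adj[new].add(i)
    return new_strat


# Hypothetical start: ring of 8 players (adjacency sets), one defector at site 0.
adj = {i: {(i - 1) % 8, (i + 1) % 8} for i in range(8)}
strat = {i: "C" for i in range(8)}
strat[0] = "D"
links_before = sum(len(nbrs) for nbrs in adj.values()) // 2
strat = generation(adj, strat, b=1.5, p=1.0, rng=random.Random(0))
links_after = sum(len(nbrs) for nbrs in adj.values()) // 2
print(strat, links_before, links_after)
```

Each rewiring removes one link and adds one new link, so the total number of links is conserved, as required by the rule described above.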

The numerical analysis for $N = 10^4$ sites shows that evolution ends up either in an absorbing state with
defectors only or in a state where cooperators and defectors form a frozen pattern. In the final cooperative
state the defectors are isolated, and their frequency depends on the initial state due to the cellular automaton
rule. It turns out that modifications in the average degree $\langle z \rangle$ can only cause a slight variation in the
results. The adaptation of the network leads to fundamental changes in the connectivity structure: the number
of high-degree sites is enhanced significantly, and these sites (leaders) are dominantly occupied by cooperators.
These structural changes are accompanied by a significant increase in the frequency of cooperators. The final
concentration of cooperators is as high as found by Santos and Pacheco (2005) (see Fig. 37) in the whole region
of the payoff parameter b (temptation to defect). It is observed, furthermore, that social crises, i.e., large
cascades in the evolution of both the structure and the frequency of cooperators, propagate through the network
if a leader's state is changed suddenly (Zimmermann and Eguíluz, 2005; Eguíluz et al., 2005).

The final connectivity structure exhibits hierarchical characters. The formation of this structure is preceded
by the cascades mentioned above, in particular, for high values of b. This phenomenon may be related to
the competition of leaders (Santos et al., 2006c), discussed in the previous section (see Figs. 35 and 36).



This type of network adaptation rule could not cause a relevant increase in the clustering coefficient. In
(Zimmermann et al., 2004; Eguíluz et al., 2005) the network adaptation rule was altered to favor the choice of
second neighbors (the neighbors of the D neighbor who are companions in distress) with another probability
p when selecting a new partner. As expected, this modification resulted in a remarkable increase in the final
clustering coefficient. A particularly high increase was found for large b and small p values.

Fundamentally different co-evolutionary dynamics were studied by Biely et al. (2007). The synchronized
strategy adoption rule was extended by the possibility of cancellation and/or creation of new links with
some probability. The numbers of removed and new links were limited by model parameters. In this model
the players were capable of estimating the expected payoffs in subsequent steps by taking into account the
possible variations in the neighborhood. In contrast to the previous model, here the number of connections
did not remain fixed. The numerical simulations indicated the emergence of oscillations in some quantities
(e.g., the frequency of cooperators and the number of links) within some region of the parameter space. For
a high frequency of cooperators the network characteristics resemble those of scale-free graphs discussed
above. The appearance of global oscillations reminds us of similar phenomena found for voluntary Prisoner's
Dilemma games (see below).

The above versions of co-evolutionary models have been extended to involve stochastic elements (irrational
choices) in the evolution of both the strategy distribution and the connectivity structure (Santos et al.,
2006a; Pacheco et al., 2006a). Such an adaptive individual behavior introduces two time scales ($\tau_l$ and $\tau_s$)
associated with the typical lifetime of the links and of the individual strategies modified by imitation via pairwise
comparison. In the limit $W = \tau_s/\tau_l \to 0$ this model reproduces the results achieved on quenched structures. In
the opposite limit ($W \to \infty$) the players readjust their connectivity structure for a fixed strategy distribution.
This latter phenomenon was investigated by Pacheco et al. (2006a,b) who assumed that the connections can
disappear exponentially with a life time and new links are created between randomly chosen players with
some rates. The characteristic death and birth rates of connections depend on the players' strategies and the
corresponding parameters will determine the stationary frequency of connections ($\Phi_{ss'}$) between the $s$-$s'$
strategy pairs ($s, s'$ = D or C). In this case the slow evolution of the strategy frequencies can be described
by a mean-field type equation based on a rescaled payoff matrix ($A \to A'$). In other words, under the mean-field
conditions the players feel an effective payoff matrix with coefficients $A'_{ss'} = A_{ss'} \Phi_{ss'}$. Evidently, this
transformation can change the ranking of payoff values. Pacheco et al. (2006a,b) have demonstrated that
the Prisoner's Dilemma can be transformed into a new game that definitely favors cooperation when a suitable
linking dynamics is chosen. The numerical analysis indicates that similar consequences can be observed
even for W = 1. Two schematic plots in Fig. 38 show the expansion of those regions of parameters where
cooperators dominate the system as W is increased from 0 to 3.
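
How a rescaling of the payoff matrix by link frequencies can change the ranking of payoffs may be illustrated with a small sketch. The payoff values and the link frequencies below are hypothetical numbers chosen for the example (they are not results of the cited papers), and the helper names are ours.

```python
def effective_payoffs(A, phi):
    """Element-wise rescaling A'_{ss'} = A_{ss'} * Phi_{ss'}."""
    return {pair: A[pair] * phi[pair] for pair in A}


def classify(A):
    """Name the two-strategy game from the ranking of R, S, T, P (ties not handled)."""
    R, S = A[("C", "C")], A[("C", "D")]
    T, P = A[("D", "C")], A[("D", "D")]
    if T > R and P > S:
        return "Prisoner's Dilemma"
    if T > R and S > P:
        return "Snowdrift"
    if R > T and P > S:
        return "Stag Hunt"
    if R > T and S > P:
        return "Harmony"
    return "degenerate"


# Keys are (own strategy, co-player strategy); values are assumed for illustration.
A = {("C", "C"): 1.0, ("C", "D"): -0.5, ("D", "C"): 1.5, ("D", "D"): 0.0}
# Hypothetical stationary link frequencies: C-C links are long-lived, links to D are not.
phi = {("C", "C"): 0.9, ("C", "D"): 0.3, ("D", "C"): 0.3, ("D", "D"): 0.3}

A_eff = effective_payoffs(A, phi)
print(classify(A), "->", classify(A_eff))
```

In this example the original ranking is that of a Prisoner's Dilemma, whereas the rescaled matrix has the ranking of a Stag Hunt game, in which mutual cooperation is a strict Nash equilibrium.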



[Figure 38: two schematic panels, labeled W = 0 and W = 3.]

Fig. 38. Red areas indicate regions of the S-T plane where cooperators prevail in the system for two different values of W. Blue
territories refer to defectors' dominance. The quadrants of Snowdrift, Stag Hunt, and Prisoner's Dilemma are denoted by the
corresponding abbreviations (Santos et al., 2006a).



In the multilevel selection model introduced by Traulsen and Nowak (2006) N players are divided into
m groups. Individual incomes (fitness) come from games between members of the same group, i.e., the
connectivity structure is a union of m disjoint complete subgraphs (cliques). At each time step of the
evolution a player is chosen randomly from the whole population with a probability proportional to her
income. The player's offspring is added to the same group. If the group size reaches a critical value n, the
players of the given group are divided randomly into two groups with probability q; otherwise (with
probability 1 - q) one of the players is eliminated from this group. The number of groups is fixed by eliminating
a randomly chosen group whenever a new group is created. Although the evolutionary dynamics is based
on individual fitness, the selection occurs at two levels, and favors the fastest reproduction for both the
individuals and the groups. This model can be interpreted as a hierarchy of two types of Moran processes,
and has been analyzed analytically in the weak selection limit for low values of q. The model was extended by
allowing migration between the groups with some probability. Using the payoff matrix (129) and considering
the fixation probabilities Traulsen and Nowak (2006) have shown that cooperators are favored over defectors
if the benefit-to-cost ratio exceeds a threshold value, b/c > 1 + z + n/m, where z is the average number of
migrants arising from a group during its lifetime.
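
The quoted threshold is easy to evaluate numerically. The short sketch below only restates the inequality b/c > 1 + z + n/m; the function names and the parameter values are hypothetical.

```python
def critical_ratio(n, m, z=0.0):
    """Threshold benefit-to-cost ratio 1 + z + n/m quoted in the text."""
    return 1.0 + z + n / m


def cooperation_favored(b, c, n, m, z=0.0):
    """True if b/c exceeds the threshold for group size n, m groups, z migrants."""
    return b / c > critical_ratio(n, m, z)


# Hypothetical example: maximum group size n = 10, m = 5 groups.
print(critical_ratio(10, 5))                        # no migration
print(cooperation_favored(4.0, 1.0, 10, 5))         # b/c = 4, z = 0
print(cooperation_favored(4.0, 1.0, 10, 5, z=2.0))  # migration raises the threshold
```

The example shows the two trends contained in the formula: many small groups (small n/m) lower the threshold, while migration (larger z) raises it.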

Finally we call the reader's attention to the work by Vainstein et al. (2007) who surveyed the effect of
players' migration on the measure of cooperation. The introduction of mobility drives the system towards
the mean-field situation favoring the survival of defectors. Despite this expectation Vainstein et al. have
described several dynamical rules increasing the frequency of cooperators. Similar results were reported
by Aktipis (2004) who considered the effect of contingent movement of cooperators: the players walked
away once a defection occurred in the previous step. In some sense, this "win-stay, lose-move" strategy is
analogous to voluntary participation, which also enhances cooperation (Saijo and Yamato, 1999; Hauert et al.,
2002; Szabó and Hauert, 2002a; Wu et al., 2005a) as detailed in the subsequent section.






6.10. Spatial Prisoner's Dilemma with three strategies 



In former sections we considered in detail the Iterated Prisoner's Dilemma game with two memoryless
strategies, AllD (shortly D) and AllC (shortly C). Enlarging the strategy space to three or more strategies
(now with memory) is a straightforward extension. The first numerical investigations in this genre focused
on games with a finite subset of stochastic reactive strategies. As mentioned in Sec. 6.2 one can easily select
three stochastic reactive strategies, for which cyclic dominance appears naturally as detailed by Nowak et al.
(1994a) and Nowak and Sigmund (1989b,a). In the corresponding spatial games most of the evolutionary rules
give rise to self-organizing patterns, similar to those characterizing spatial Rock-Scissors-Paper games that
we discuss later in Sec. 7.

In the absence of cyclic dominance, a three-strategy spatial game usually develops into a state where one
or two strategies die out. This situation can occur in a model which allows for D, C, and Tit-for-Tat (shortly
T) strategies (see Appendix B.2 for an introduction on Tit-for-Tat). In this case the T strategy will invade
the territory of D, and finally the evolution terminates in a mixture of C and T strategies, as was described
by Axelrod (1984) in the non-spatial case. In most spatial models, for sufficiently high b any external support
of C against T yields cyclic dominance: T invades D invades C invades T.

In real social systems a number of factors can favor C against T. For example, permanent inspection,
i.e., keeping track of the opponent's moves, introduces a cost which reduces the net income of T strategists.
A similar payoff reduction is found for Generous T players (see again Appendix B.2 for an introduction on
Generous Tit-for-Tat). In some populations the C strategy may be favored by education or by the appearance
of new, inexperienced generations. These effects can be built into the payoff matrix (Imhof et al., 2005)
and/or can be handled via the introduction of an external force (Szabó et al., 2000). Independently of the
technical details, these evolutionary games have some common features that we discuss here for the Voluntary
Prisoner's Dilemma and later for the Rock-Scissors-Paper game.



The idea of a Voluntary Prisoner's Dilemma game comes from Voluntary Public Good games (Hauert et al.,
2002) or Stag Hunt games (see Appendix A.12), where the players are allowed to avoid exploitation. Namely,
the players can refuse to participate in these multi-player games. Besides the traditional two strategies (D
and C) this option can be considered as a third strategy called "Loner" (shortly L). The cyclic dominance
between these strategies (L invades D invades C invades L) is due to the average payoff of L. The experimental
verification that this leads to a Rock-Scissors-Paper-like dynamics is reported in Semmann et al.
(2003).

In the spatial versions of Public Good games (Hauert et al., 2002; Szabó and Hauert, 2002b) the players'
income comes from multi-player games with their neighborhood. This makes the calculations very complicated.
However, all the relevant properties of these models can be preserved when we reduce the five- (or
nine-) site interactions into pairwise interactions. With this proviso, the model becomes equivalent to the
Voluntary Prisoner's Dilemma characterized by the following payoff matrix:

\[
A = \begin{pmatrix} 0 & b & a \\ 0 & 1 & a \\ a & a & a \end{pmatrix} \tag{130}
\]



where the loner and her opponent share a payoff 0 < a < 1. Notice that the sub-matrix corresponding to
the D and C strategies is the same as that used previously in Eq. (123). A similar payoff structure can arise for a
job made in collaboration by two workers, if the collaboration itself involves a Prisoner's Dilemma. In this
case, the workers can look for other independent jobs yielding a lower income a, if either of them declines the
common work.

First we consider the Voluntary Prisoner's Dilemma game on a square lattice with nearest neighbor
interaction (Szabó and Hauert, 2002a). The evolutionary dynamics is assumed to be random sequential
with a transition rate defined by Eq. (124). According to the traditional mean-field approximation, the
time-dependent strategy frequencies follow the typical trajectories shown in Fig. 39, that is, evolution tends
towards the homogeneous loner state.[13] A similar behavior was found by Hauert et al. (2002), when they
considered a Voluntary Public Good game.




Fig. 39. Trajectories of the Voluntary Prisoner's Dilemma game predicted by the traditional mean-field approximation for 
a = 0.3, b = 1.5, and K = 0.1. 

The simulations indicate a significantly different behavior on the square lattice. In the absence of L
strategies this model agrees with the system discussed in Sec. 4.2, where cooperators become extinct if
$b > b_{cr}(K)$ [$\max(b_{cr}) \simeq 1.1$]. In contrast, cooperators can survive for arbitrary b in the presence of
loners.

Figure 40 shows the coexistence of all three strategies. Note that this is a dynamic image. It is perpetually 
in motion due to propagating interfaces, which reflect the cyclic dominance nature of the strategies. Nev- 
ertheless, the qualitative properties of this self-organizing pattern are largely unchanged for a given payoff 
matrix. 







Fig. 40. A typical distribution of the three strategies on the square lattice for the Voluntary Prisoner's Dilemma in the vicinity 
of the extinction of loners. 



It is worth emphasizing that cyclic dominance is not directly built into the payoff matrix (130). Cyclic 
invasions appear at the corresponding boundaries, where the role of strategy adoption and clustering is 
important. 
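
The absence of a built-in cycle can be checked directly at the level of pairwise payoffs. The sketch below assumes the strategy order (D, C, L) and the reading of the payoff matrix in which D and C play the weak Prisoner's Dilemma (0, b; 0, 1) among themselves and every pairing involving a loner pays a to both players; the parameter values follow the figure captions (a = 0.3, b = 1.5), while the function names are hypothetical.

```python
def voluntary_pd(b, a):
    """Assumed payoff table A[own][co-player] for the strategies D, C and L."""
    return {
        "D": {"D": 0.0, "C": b, "L": a},
        "C": {"D": 0.0, "C": 1.0, "L": a},
        "L": {"D": a, "C": a, "L": a},
    }


def mutant_advantage(A, mutant, resident):
    """Payoff of a rare mutant minus that of a resident, both facing residents."""
    return A[mutant][resident] - A[resident][resident]


A = voluntary_pd(b=1.5, a=0.3)
for mutant, resident in (("D", "C"), ("L", "D"), ("C", "L")):
    print(mutant, "in a population of", resident, ":",
          mutant_advantage(A, mutant, resident))
```

In this well-mixed comparison D gains against resident cooperators and L gains against resident defectors, but a single C in a population of loners earns exactly the same as the loners. The third leg of the cycle (C invading L) therefore has to come from clustering at the boundaries, where cooperators also meet other cooperators.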

The results of a systematic analysis are summarized in Fig. 41 (Szabó and Hauert, 2002a). Notice that
loners cannot survive for low values of b, where the defector frequency is too low to feed loners. Furthermore,
the loner frequency increases monotonically with b, despite the fact that large b only increases the payoff of
defectors. This unusual behavior is related to the cyclic dominance as explained later in Sec. 7.7.



[13] Notice that L is attractive but unstable (see the relevant discussion in Sec. 3.3).

Fig. 41. Stationary frequency of defectors (squares), cooperators (diamonds), and loners (triangles) vs. b for the Voluntary
Prisoner's Dilemma on the square lattice at K = 0.1 and a = 0.3. The predictions of the pair approximation are indicated by
solid lines. Dashed lines refer to the region where this stationary solution becomes unstable.



As b decreases the stationary frequency of loners vanishes algebraically at a critical point (Szabó and Hauert,
2002a). In fact, the extinction of loners belongs to the directed percolation universality class, although the
corresponding absorbing state is an inhomogeneous and time-dependent one. However, on large spatial and
temporal scales, the rate of birth and death becomes homogeneous on this background, and these are the
relevant criteria for belonging to the robust universality class of directed percolation, as was discussed in
detail by Hinrichsen (2000).

The predictions of the pair approximation are also illustrated in Fig. 41. Contrary to the two-strategy 
model, here the agreement between the pair approximation and the Monte Carlo data is surprisingly good. 
Figure 29 nicely demonstrates that the two-strategy pair approximation can fail in those regions, where 
one of the strategy frequencies is low. This is, however, not the case here. For three-strategy solutions the 
accuracy of this approach becomes better. The stability analysis, however, reveals a serious shortcoming: 
the stationary solutions become unstable in the region of parameters indicated by dashed lines in Fig. 41. 
In this region the numerical analysis indicates the appearance of global oscillations in the frequencies of 
strategies. 

Although the shortcomings of the generalized mean-field approximation on the square lattice diminish
when we go beyond the pair approximation and consider higher levels (larger clusters), it would be enlightening
to find other connectivity structures on which the pair approximation becomes acceptable. Possible
candidates are the Bethe lattice and random regular graphs for large N (as discussed briefly in Sec. 5),
because this approach neglects correlations mediated by four-site loops. In other words, the prediction of
the pair approximation is more reliable if loops are missing, as in the Bethe lattice.

The simulations (Szabó and Hauert, 2002a) found a behavior similar to that on the square lattice (see
Fig. 41) if b remains below a threshold value $b_1$ depending on K. For higher b global oscillations were
observed, in nice agreement with the prediction of the pair approximation, as illustrated in Fig. 42. To avoid
confusion in this plot we only illustrate the frequency of defectors, obtained on a random regular graph
with $10^6$ sites. The plotted values (closed squares) are determined by averaging the defector's frequency
over a long sampling time, typically $10^5$ Monte Carlo steps (MCS). During this time the maximum and
minimum values of the defector's frequency were also determined (open squares). The predictions of the
pair approximation for the same quantities can be easily determined, and they are compared with the
Monte Carlo data in Fig. 42. Both methods predict that the "amplitude" of global oscillation increases with
b until a second threshold value $b_2$. If $b > b_2$, the strategy frequencies oscillate with a growing amplitude,
and after some time one of them dies out. Finally the system develops into a homogeneous state. In most
cases cooperators become extinct first, and then loners sweep out defectors.

Fig. 42. Monte Carlo results for the frequency of defectors as a function of b on a random regular graph for
K = 0.1 and a = 0.3. The average frequency of defectors is denoted by closed squares. Open squares indicate the maximum
and minimum values taken during the simulation. Predictions of the pair approximation are shown by solid (stable) and dashed
(unstable) lines.



Further investigations of this model were aimed at clarifying the effect of partial randomness (Szabó and Vukov,
2004). When tuning the portion of rewired links on a regular small-world network (see Fig. 15), the
self-organizing spatial pattern transforms into a state exhibiting global oscillations. If the portion of rewired
bonds exceeds about ten percent (q > 0.1) the small-world effect leads to a behavior similar to that found
on random regular graphs. Using a different evolutionary rule Wu et al. (2005a) reported similar features
on the small-world structure suggested by Newman and Watts (1999).

In summary, the possibility of voluntary participation in spatial Prisoner's Dilemma games supports
cooperation via a cyclic dominance mechanism, and leads to a self-organizing pattern. Random links (or
small-world effects) destroy this spatio-temporal structure and synchronize the local oscillations over
the whole system. In some parameter regions the evolution is terminated in a homogeneous state of loners
after some transient processes, during which the amplitude of the global oscillations increases. In this final
state the interaction between the players vanishes, that is, they do not try to benefit from possible mutual
cooperation. According to the classical mean-field calculation (Hauert et al., 2002) an analogous behavior
is expected for the connectivity structures shown in Fig. 13. These general properties are rather robust:
similar behavior can be observed if the loner strategy is replaced by some Tit-for-Tat strategies in these
evolutionary Prisoner's Dilemma games.



6.11. Tag-based models 



Kin selection (Hamilton, 1964a,b) can support cooperation among individuals (relatives), who mutually
help each other. This mechanism assumes that individuals are capable of identifying their relatives to be
donated. In fact, tags characterizing a group of individuals can facilitate selective interactions, leading to
growing aggregates in spatial systems (Holland, 1995). Such a tag-matching mechanism was taken into
account in the image scoring model introduced by Nowak and Sigmund (1998), and many other, so-called
tag-based models.

In the context of the Prisoner's Dilemma a simple tag-based model was introduced by Hales (2000). In
his model each player possesses a memoryless strategy, D or C, and a tag, which can take a finite number of
discrete values. A tag is a phenotypic marker like sex, age, cultural traits, language, etc., or any other signal
that individuals can detect and distinguish (Robson, 1990). Agents play a Prisoner's Dilemma game with
random opponents, provided that the co-player possesses the same tag. This mechanism partitions the social
network into disjoint clusters with identical tags. There is no play between clusters. Nevertheless, mutation
can take individuals from one tag cluster into another, or change their strategy with some probability. Players
reproduce in proportion to their average payoff.

Hales (2000) found that cooperative behavior flourishes in the model when the number of tags is large
enough. Clusters with different tag values compete with each other in the reproductive step, and those that
contain a large proportion of cooperators outperform other clusters. However, such clusters have a finite life
time, since if mutations implant defectors into a tag cluster, they soon invade the cluster and shortly cause
its extinction (within a cluster it is a mean-field game where defectors soon expel cooperators). The room
is taken over by other growing cooperative groups. Due to mutations new tag groups show up from time
to time, and if these happen to be C players they start to increase. At each step there are a small number
of mostly cooperative clusters present in the system, i.e., the average level of cooperation is high, but the
prevailing tag values vary from time to time. Jansen and van Baalen (2006) have considered the maintenance
of cooperation through the so-called beard chromodynamics in a spatial system where the tag (color) and
behavior (C or D strategy) are inherited via loosely coupled genes. Similar models for finite populations are
analyzed by Traulsen and Nowak (2007) (see also further references therein).

A rather different tag-based model, similar in spirit to Public Good games (see Appendix A.8), was
suggested by Riolo et al. (2001). Each player x is characterized by a tag value $\tau_x \in [0, 1]$ and a tolerance
level $\Delta_x \ge 0$, which constitute the player's strategy in a continuous strategy space. Initially the tags and the
tolerance levels are assigned to players at random. For a given step (generation) each player is paired with
all others, but player x only donates to player y if $|\tau_x - \tau_y| \le \Delta_x$. Agents with close enough tag values are
forced to be altruistic. Player x incurs a cost c and player y receives a benefit b (b > c > 0) if the tag-based
condition of donation (cooperation) is satisfied. At the end of each step the players reproduce: the income of
each player is compared with another player's income chosen at random, and the loser adopts the winner's
strategy ($\tau$, $\Delta$). During the strategy adoption process mutations are allowed with a small probability. Using
numerical simulations Riolo et al. (2001) demonstrated that tagging can again promote altruistic behavior
(cooperation).
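
The donation phase of one generation can be sketched as follows. Only the donation rule is shown (the reproduction and mutation steps are omitted); the three players, their tags and tolerances, and the values b = 4 and c = 1 are hypothetical, and the donation condition is read as "tag distance not larger than the donor's tolerance".

```python
def donation_round(tags, tolerances, b, c):
    """Income of every player after each player was paired with all others.

    Player x donates to player y (x pays c, y receives b) whenever the tag
    distance |tags[x] - tags[y]| does not exceed the tolerance of x.
    """
    n = len(tags)
    income = [0.0] * n
    for x in range(n):
        for y in range(n):
            if x != y and abs(tags[x] - tags[y]) <= tolerances[x]:
                income[x] -= c
                income[y] += b
    return income


# Hypothetical population of three players.
tags = [0.10, 0.15, 0.80]
tolerances = [0.1, 0.0, 0.9]
print(donation_round(tags, tolerances, b=4.0, c=1.0))
```

In this toy population the second player, whose tolerance is zero and who therefore never donates, collects the highest income because she receives donations from both others. This is the temptation that the imitation step of the model has to counteract.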



In the above model the strategies are characterized by two continuous parameters. Traulsen and Schuster
(2003) have shown that the main features can be reproduced with only four discrete strategies. The minimal
model has two types of players (species), say Red and Blue, and both types can possess either minimum
($\Delta = 0$) or maximum ($\Delta = 1$) tolerance levels. Thus, (Red,1) and (Blue,1) donate to all others, while (Red,0)
[resp., (Blue,0)] only donates to (Red,0) and (Red,1) [resp., (Blue,0) and (Blue,1)]. The corresponding payoff
matrix is



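\[
A = \begin{pmatrix}
b - c & b - c & b - c & -c \\
b - c & b - c & -c & b - c \\
b - c & b & b - c & 0 \\
b & b - c & 0 & b - c
\end{pmatrix} \tag{131}
\]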



for the strategy order (Red,1), (Blue,1), (Red,0), and (Blue,0). This game has two pure Nash equilibria,
(Red,0) and (Blue,0), and an evolutionarily unstable mixed-strategy Nash equilibrium, where the two intolerant
($\Delta = 0$) strategies are present with probability 1/2.
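
The pairwise payoffs of the four-strategy model follow from the donation rules alone, which can be checked with a short sketch. It assumes that every donation costs the donor c and yields b to the recipient, as in the continuous model; the numerical values b = 1 and c = 0.25 and all function names are hypothetical.

```python
STRATEGIES = [("Red", 1), ("Blue", 1), ("Red", 0), ("Blue", 0)]


def donates(donor, recipient):
    """Tolerant players donate to everybody, intolerant ones only to their own tag."""
    tag_d, tolerance = donor
    tag_r, _ = recipient
    return tolerance == 1 or tag_d == tag_r


def payoff(row, col, b, c):
    """Payoff of `row` against `col`: benefit received minus cost paid."""
    return b * donates(col, row) - c * donates(row, col)


def payoff_matrix(b, c):
    return [[payoff(r, s, b, c) for s in STRATEGIES] for r in STRATEGIES]


def pure_nash(A):
    """Indices of strategies that are best replies to themselves."""
    n = len(A)
    return [i for i in range(n) if all(A[i][i] >= A[j][i] for j in range(n))]


A = payoff_matrix(b=1.0, c=0.25)
for name, row in zip(STRATEGIES, A):
    print(name, row)
print([STRATEGIES[i] for i in pure_nash(A)])
```

For these parameter values only the two intolerant strategies are best replies to themselves, in agreement with the statement above about the pure Nash equilibria.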

The most interesting features appear when mutations are introduced. For example, allowing that (Red,0)
[resp., (Blue,0)] players mutate into (Red,1) [resp., (Blue,1)] players, the population of (Red,1) strategists
can increase in a predominantly (Red,0) state, until the minority (Blue,0) players receive the highest income
and begin to invade the system. In the resulting population the frequency of (Blue,1) increases until (Red,0)
dominates, thus restarting the cycle. In short, this system exhibits spontaneous oscillations in the frequency
of tags and tolerance levels, which were found to be one of the characteristic features in the more complicated
model (Riolo et al., 2001; Sigmund and Nowak, 2001). In the simplified version, however, many aspects of
this dynamics can be investigated more quantitatively as was discussed by Traulsen and Schuster (2003).

The spatial (cellular automaton) version of the above discrete tag-tolerance model was studied by Traulsen and
Claussen (2004). As was expected, the spatial version clearly demonstrated the formation of large Red and Blue
domains. The spatial model allows for the occurrence of tolerant mutants in a natural way. In the cellular
automaton rule they used, a player increased her tolerance level from $\Delta = 0$ to 1, if she was surrounded
by players with the same tag. In this case the cyclic dominance property of the strategies gives rise to a
self-organizing pattern with rotating spirals and anti-spirals, which will be discussed in detail in the next
Section. It is remarkable how these models sustain a poly-domain structure (Traulsen and Schuster, 2003).
The coexistence of (Red,1) and (Blue,1) domains is stabilized by the appearance of intolerant strategies,
(Red,0) and (Blue,0), along the peripheries of these regions.

The main feature of tag-based models can be recognized in the systems studied by Ahmed et al. (2006)
who considered what happens if the players are grouped, and their choices and/or evolutionary rules depend
on the group tags. Evidently, tag-based models can be extended by allowing some variations of the tag,
which is related to the evolving community structure on the network.



7. Rock-Scissors-Paper games 



Rock-Scissors-Paper games exemplify those systems from ecology or non-equilibrium physics where three
states (strategies or species) cyclically dominate each other (May and Leonard, 1975; Hofbauer and Sigmund,
1988; Tainaka, 2001). Although cyclic dominance does not manifest itself directly in the payoff matrices
of the Prisoner's Dilemma, that model can exhibit similar properties when three stochastic reactive
strategies are considered (Nowak and Sigmund, 1989b,a). Cyclic dominance can also be caused by spatial
effects as observed in some spatial three-strategy evolutionary Public Good games (Hauert et al.,
2002; Szabó and Hauert, 2002b) and Prisoner's Dilemma games (Nowak et al., 1994a; Szabó et al., 2000;
Szabó and Hauert, 2002a).



Rock-Scissors-Paper-type cycles can be observed directly in nature. One of the best known examples, described
by Sinervo and Lively (1996), is provided by the three different mating strategies of the lizard species Uta
stansburiana. Similarly, the interaction networks of many marine ecological communities involve three-species
cycles (Johnson and Seinen, 2002). Successional three-state systems such as space-grass-trees (Durrett and Levin,
1998) or forest fire models with states "green tree", "burning tree", and "ash" (Bak et al., 1990; Drossel and Schwabl,
1992; Grassberger, 1993) may also have similar dynamics. For many spatial predator-prey (Lotka-Volterra)
models the "empty site", "prey", and "predator" states also follow each other cyclically (Bascompte and Solé,
1998; Tilman and Kareiva, 1997). The synchronized versions of these phenomena have been studied in the
framework of cellular automaton models by several authors (Greenberg and Hastings, 1978; Fisch, 1990;
Silvertown et al., 1992; Fuks and Lawniczak, 2001).

Cyclic dominance also occurs in the so-called epidemiological SIRS models. The mean-field version of the simpler
SIR model was formulated by Lowell Reed and Wade Hampton Frost [their work was never published but
is cited in (Newman, 2002)] and by Kermack and McKendrick (1927). The abbreviation SIR refers to the
three possible states of individuals during the spreading of an infectious disease: "susceptible", "infected",
and "removed". In this context "removed" means either having recovered and become immune to further
infection or having died. In the more general SIRS model susceptible individuals replace removed ones at
a certain rate, representing the birth of susceptible offspring or the loss of immunity and the regaining of
susceptible status after a while.
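
The cyclic S → I → R → S structure can be made concrete with a minimal mean-field sketch. The text above does not write out the rate equations, so the standard mass-action form used here, the rate names `beta` (infection), `delta` (removal), `gamma` (replacement of removed by susceptible individuals), and all numerical values are assumptions chosen only for illustration.

```python
def sirs_rhs(state, beta, delta, gamma):
    """Mean-field SIRS rates for the fractions (s, i, r); all rates are hypothetical."""
    s, i, r = state
    infection = beta * s * i      # susceptible -> infected, needs an infected partner
    removal = delta * i           # infected -> removed
    replacement = gamma * r       # removed -> susceptible (gamma = 0 gives plain SIR)
    return (replacement - infection, infection - removal, removal - replacement)


def integrate(state, beta, delta, gamma, dt, steps):
    """Explicit Euler integration; returns the list of visited states."""
    trajectory = [state]
    for _ in range(steps):
        ds, di, dr = sirs_rhs(state, beta, delta, gamma)
        state = (state[0] + dt * ds, state[1] + dt * di, state[2] + dt * dr)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    final = integrate((0.99, 0.01, 0.0), beta=1.0, delta=0.3, gamma=0.1,
                      dt=0.01, steps=5000)[-1]
    print("s, i, r after integration:", final)
```

Each of the three terms moves weight from one state to the next one in the cycle, so the sum s + i + r is conserved by construction.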

Another biological system demonstrates how cyclic dominance occurs in the biochemical warfare among
three types of microbes. Bacteria secrete toxic substances that are very effective against strains of their microbial
cousins who do not produce the corresponding resistance factor (Durrett and Levin, 1997; Nakamaru and Iwasa,
2000; Czárán et al., 2002; Kerr et al., 2002; Kirkup and Riley, 2004; Neumann and Schuster, 2007). For a
certain toxin three types of bacteria can be distinguished: the Killer type produces both the toxin and the
corresponding resistance factor (to prevent its suicide); the Resistant produces only the resistance factor;
and the Sensitive produces neither. Colonies of Sensitive can always be invaded and replaced by Killers.
At the same time Killer colonies can be invaded by Resistants, because the latter are immune to the toxin
but do not carry the metabolic burden of synthesizing the toxic substance. Due to their faster reproduction
rate they achieve competitive dominance over the Killer type. Finally the cycle is closed: Sensitives have a
metabolic advantage over Resistants, because they do not even have to pay the cost of producing the resistance
factor. In general, cyclic dominance in biological systems can play two fundamental roles: supporting
bio-diversity (May and Leonard, 1975; Kerr et al., 2002) and providing protection against external invaders
(Boerlijst and Hogeweg, 1991; Szabó and Czárán, 2001b).
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It is worth mentioning, though, that cyclic invasion processes and the resultant spiral wave structures have 
been investigated in some physical, chemical, or physiological contexts for a long time. The typical chemical 
example is the Belousov-Zhabotinsky reaction (Field and Noyes, 1974). Similar phenomena were also
found for the Rayleigh-Bénard convection in fluid layers (Busse and Heikes, 1980; Toral et al.,
2000), as well as in many other excitable media, e.g., cardiac muscle (Wiener and Rosenblueth, 1946) and
neural systems (Hempel et al., 1999). Recently Prager et al. (2003) have introduced a similar three-state
model to consider stochastic excitable systems.

In the following we first survey the picture that can be deduced from Monte Carlo simulations of the 
simplest spatial evolutionary Rock-Scissors-Paper games. The results will be compared with the general 
mean-field approximation at different levels. Then the consequences of some modifications (extensions) will 
be discussed. We will consider what happens if we modify the dynamics, the payoff matrix, the background, 
or the number of strategies. 



7.1. Simulations on the square lattice 



In the Rock-Scissors-Paper model introduced by Tainaka (1988, 1989, 1994), agents are located on the
sites of a square lattice (with periodic boundary conditions), and follow one of three possible strategies to
be denoted shortly as R, S, and P. The evolution of the population is governed by random sequential 
invasion events between randomly chosen nearest neighbors. Nothing happens if the two agents follow the 
same strategy. For different strategies, however, the direction of invasion is determined by the rule of the 
Rock-Scissors-Paper game, that is, R invades S invades P invades R. 
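
The invasion rule just described can be sketched in a few lines. This is a minimal illustration, not the code used in the cited works: the integer encoding 0 = R, 1 = S, 2 = P, the function names, and the convention that one Monte Carlo step consists of L × L elementary invasion attempts are assumptions made for the example.

```python
import random

# Assumed encoding: 0 = R, 1 = S, 2 = P, so that strategy s invades (s + 1) % 3,
# i.e. R invades S, S invades P, and P invades R.
NEIGHBOR_OFFSETS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def invades(a, b):
    """Return True if strategy a invades strategy b."""
    return (a + 1) % 3 == b


def random_lattice(L, rng):
    """Uncorrelated random initial state on an L x L lattice."""
    return [[rng.randrange(3) for _ in range(L)] for _ in range(L)]


def elementary_step(lattice, rng):
    """One random sequential invasion attempt between nearest neighbors."""
    L = len(lattice)
    x, y = rng.randrange(L), rng.randrange(L)
    dx, dy = rng.choice(NEIGHBOR_OFFSETS)
    nx, ny = (x + dx) % L, (y + dy) % L      # periodic boundary conditions
    a, b = lattice[x][y], lattice[nx][ny]
    if invades(a, b):
        lattice[nx][ny] = a
    elif invades(b, a):
        lattice[x][y] = b
    # identical strategies: nothing happens


def monte_carlo_step(lattice, rng):
    """Assumed time unit: L * L elementary attempts."""
    L = len(lattice)
    for _ in range(L * L):
        elementary_step(lattice, rng)


def concentrations(lattice):
    """Fractions of R, S, and P sites."""
    L = len(lattice)
    counts = [0, 0, 0]
    for row in lattice:
        for s in row:
            counts[s] += 1
    return [c / (L * L) for c in counts]


if __name__ == "__main__":
    rng = random.Random(1)
    lattice = random_lattice(40, rng)
    for _ in range(50):
        monte_carlo_step(lattice, rng)
    print(concentrations(lattice))
```

Because the chosen pair is treated symmetrically, either member may be the invader; a homogeneous lattice is left unchanged by every attempt.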

Notice that this dynamical rule corresponds to the $K \to 0$ limit of Smoothed Imitation in Eqs. (64) with
(65), if the players' utility comes from a single game with one of their neighbors. For the sake of simplicity 
we choose the following payoff matrix: 

$$ A = \begin{pmatrix} 0 & 1 & -1 \\ -1 & 0 & 1 \\ 1 & -1 & 0 \end{pmatrix} . \tag{132} $$

The above strategy adoption mechanism mimics a simple biological interaction rule between predators 
and their prey: the predator eats the prey and the predator's offspring occupies the empty site. This is why 
the present model is also called a cyclic predator-prey (or cyclic spatial Lotka-Volterra) model. The payoff
matrix (132) is equivalent to the adjacency matrix (Bollobás, 1998) of the three-species cyclic food web
represented by a directed graph.

Due to its simplicity the present model can be studied accurately by Monte Carlo simulations on a finite 
L x L box with periodic boundary conditions. Independently of the initial state the system evolves into a 
self-organizing pattern, where each species is present with a probability 1/3 on average (for a snapshot see 
Fig. 43). This pattern is characterized by perpetual cyclic invasions, and the average velocity of the invasion
fronts is approximately one lattice unit per Monte Carlo step. 

Some features of the self-organizing pattern can be described by traditional concepts. For example, the 
equal-time two-site correlation function is defined as 

$$ C(x) = \frac{1}{L^2} \sum_{y} \delta[s(y), s(y+x)] - \frac{1}{3} , \tag{133} $$

where $\delta(s_1, s_2)$ is the Kronecker delta, and the summation runs over all the sites y. Figure 44 shows C(x) as
obtained by averaging over $2 \cdot 10^4$ independent patterns with a linear size L = 4000. The log-lin plot clearly
indicates that the correlation function vanishes exponentially in the asymptotic region, $C(x) \propto e^{-x/\xi}$, where
the numerical fit yields $\xi = 2.8(2)$. Above $x \approx 30$ the statistical error becomes comparable to |C(x)|.
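
As a rough illustration of how such a measurement could be organized, the sketch below evaluates the equal-time two-site correlation function along the horizontal direction for a single configuration and extracts a correlation length from a log-linear least-squares fit. The function names, the lattice representation (a list of rows of integer strategy labels), and the fitting procedure are assumptions of this example; the averaging over many independent patterns described above is omitted.

```python
import math


def correlation(lattice, x):
    """C(x) for one configuration: fraction of equal pairs at horizontal
    distance x (periodic boundaries) minus the uncorrelated value 1/3."""
    L = len(lattice)
    same = 0
    for row in lattice:
        for j in range(L):
            if row[j] == row[(j + x) % L]:
                same += 1
    return same / (L * L) - 1.0 / 3.0


def fit_correlation_length(xs, cs):
    """Least-squares slope of ln C(x) versus x; returns xi = -1 / slope.
    Non-positive values of C(x) are skipped because their logarithm is undefined."""
    pts = [(x, math.log(c)) for x, c in zip(xs, cs) if c > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return -sxx / sxy


if __name__ == "__main__":
    xs = list(range(5, 30))
    synthetic = [math.exp(-x / 2.8) for x in xs]   # synthetic data, not simulation output
    print(fit_correlation_length(xs, synthetic))
```

In practice the fit window has to stop where the statistical error becomes comparable to |C(x)|, as noted above for x ≈ 30.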

Each species on the snapshot in Fig. 43 will be invaded sooner or later by its predator. Using Monte Carlo
simulations one can easily determine the average survival probability of individuals, $S_i(t)$, defined as the
portion of sites where the individuals remain alive from a time $t_0$ to $t_0 + t$ in the stationary state. The
numerical results are consistent with an exponential decrease in the asymptotic region, $S_i(t) \simeq e^{-t/\tau_i}$ with
$\tau_i = 1.9(1)$.

Fig. 43. Typical spatial distribution of the three strategies on a block of 100 × 100 sites of a square lattice.
Fig. 44. Correlation function vs. horizontal (or vertical) distance x for the spatial distribution plotted in Fig. 43.

Ravasz et al. (2004) studied a case when each newborn predator inherits the family name of its parent.
One can study how the number of surviving families decreases while their average size increases. The survival
probability of families $S_f(t)$ gives the portion of surviving families (with at least one representative) at time
$t_0 + t$ if at time $t_0$ all the species had different family names. The numerical simulation revealed (Ravasz et al.,
2004) that the survival probability of families varies as $S_f(t) \simeq \ln(t)/t$ in the asymptotic limit $t \gg \tau_i$, which
is a typical behavior for the two-dimensional voter model (Ben-Naim et al., 1996a). This behavior is a direct
consequence of the fact that for large times the confronting species (along the family boundaries) face their
predator or prey with equal probability, as in the voter model [for a textbook see Liggett (1985)].

Starting from an initial state of parallel stripes or concentric rings, the interfacial roughening of invasion
fronts was studied by Provata and Tsekouras (2003). As expected, the propagating fronts show dynamical
characteristics that are similar to those of the Eden growth model. This means that smooth interfaces
become more and more irregular, and finally the pattern develops into a random domain structure as
plotted in Fig. 43.

Due to cyclic dominance, at any site of the lattice the species follow each other cyclically. Short-range
interactions with noisy dynamics are not able to fully synchronize these oscillations. On the square lattice
Monte Carlo simulations show damped oscillations of the strategy concentrations. If the system is started
from an asymmetric initial state it spirals towards the symmetric state, as is illustrated on the strategy
simplex (ternary diagram) in Fig. 45. The symmetric state is asymptotically stable.

Fig. 45. Trajectory of evolution during a Monte Carlo simulation for $N = 1000^2$ agents when the system is started from an
asymmetric initial state indicated by a plus symbol. The seemingly smooth trajectory is decorated with noise, but the noise
amplitude is comparable to the line width at the given system size.

For small system sizes (e.g., L = 10) the simulations show significantly different behavior, because
occasionally one of the species becomes extinct. Then the system evolves into one of the three homogeneous
states, and further evolution halts. The probability to reach one of these absorbing states decreases very fast
as the system size increases.



This model was also investigated on a one-dimensional chain (Tainaka, 1988; Frachebourg et al., 1996a)
and on a three-dimensional cubic lattice (Tainaka, 1989, 1994). The corresponding results will be discussed
later on. First, however, we survey the prediction of the traditional mean-field approximation, whose result 
is independent of the spatial dimension d. Subsequently we will consider the predictions of the pair and the 
four-site approximations. 



7.2. Mean-field approximation 



Within the framework of the traditional mean-field approximation the system is characterized by the 
concentration of the three strategies. These quantities are directly related to the one-site configurational 
probabilities $p_1(s)$, where for later convenience we use the notation s = 1, 2, and 3 for the three strategies
R, S, and P, respectively. (We neglect the explicit notation of time dependence.) These time-dependent
quantities satisfy the corresponding approximate mean value equation, Eq. (80), i.e.,

$$ \begin{aligned}
\dot{p}_1(1) &= p_1(1) [ p_1(2) - p_1(3) ] , \\
\dot{p}_1(2) &= p_1(2) [ p_1(3) - p_1(1) ] , \\
\dot{p}_1(3) &= p_1(3) [ p_1(1) - p_1(2) ] .
\end{aligned} \tag{134} $$

The summation of these equations yields $\dot{p}_1(1) + \dot{p}_1(2) + \dot{p}_1(3) = 0$, i.e., the present evolutionary rule leaves
the sum of the concentrations unchanged, in agreement with the condition of normalization $\sum_s p_1(s) = 1$.
It is also easy to see that Eq. (134) gives $d \sum_s \ln[p_1(s)] / dt = \sum_s \dot{p}_1(s)/p_1(s) = 0$, hence

$$ m = p_1(1) p_1(2) p_1(3) = \mbox{constant} \tag{135} $$

is another constant of motion. 
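
Both conservation laws are easy to check numerically. The following sketch integrates Eq. (134) with a classical fourth-order Runge-Kutta scheme and monitors the sum and the product of the concentrations; the integrator, the time step, and the initial condition are choices made for this illustration only.

```python
def rhs(p):
    """Right-hand side of Eq. (134) for p = (p1(1), p1(2), p1(3))."""
    p1, p2, p3 = p
    return (p1 * (p2 - p3), p2 * (p3 - p1), p3 * (p1 - p2))


def rk4_step(p, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(pi + h * ki for pi, ki in zip(p, k))

    k1 = rhs(p)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(pi + dt * (a + 2 * b + 2 * c + d) / 6
                 for pi, a, b, c, d in zip(p, k1, k2, k3, k4))


def trajectory(p0, dt, steps):
    """List of states visited, starting from p0."""
    states = [tuple(p0)]
    for _ in range(steps):
        states.append(rk4_step(states[-1], dt))
    return states


def product(p):
    """The constant of motion m of Eq. (135)."""
    return p[0] * p[1] * p[2]


if __name__ == "__main__":
    states = trajectory((0.36, 0.36, 0.28), dt=0.01, steps=3000)
    print("sum:", sum(states[-1]))
    print("m at start and end:", product(states[0]), product(states[-1]))
```

The sum is preserved up to rounding because the three right-hand sides add up to zero at every stage, while the product is preserved only up to the discretization error of the integrator.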
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The equations of motion (134) have some trivial stationary solutions. The symmetric (central) distribution 
$p_1(1) = p_1(2) = p_1(3) = 1/3$ is invariant. Furthermore, Eqs. (134) have three additional homogeneous solutions:
$p_1(1) = 1$, $p_1(2) = p_1(3) = 0$, and two others obtained by cyclic permutations of the indices.

The homogeneous stationary solutions are unstable. Small perturbations are not damped, except for the 
unlikely situation when only prey are added to a homogeneous state of their predators. Under generic 
perturbations the homogeneous state is completely invaded by the corresponding predator strategy. On 
the other hand, perturbing the central solution the system exhibits a periodic oscillation, that is, $p_1(s) =
1/3 + \varepsilon \sin(\omega t + 2 s \pi / 3)$ with $\omega = 1/\sqrt{3}$ for s = 1, 2, and 3 in the limit $\varepsilon \to 0$. This periodic oscillation
becomes more and more anharmonic for increasing initial perturbation amplitude, and finally the trajectory 
approaches the edges of the triangle on the strategy simplex as shown in Fig. 46. The symmetric solution is 
stable but not asymptotically stable. 
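
The frequency of the small-amplitude oscillation follows from linearizing Eq. (134) around the central point. The short derivation below is a worked sketch; the deviations $x_s$ are auxiliary symbols introduced only for this purpose.

```latex
% Write p_1(s) = 1/3 + x_s with x_1 + x_2 + x_3 = 0 and keep first-order terms:
\dot{x}_1 = \tfrac{1}{3}(x_2 - x_3), \qquad
\dot{x}_2 = \tfrac{1}{3}(x_3 - x_1), \qquad
\dot{x}_3 = \tfrac{1}{3}(x_1 - x_2).
% Differentiating the first equation once more and using x_2 + x_3 = -x_1:
\ddot{x}_1 = \tfrac{1}{3}(\dot{x}_2 - \dot{x}_3)
           = \tfrac{1}{9}\left[(x_3 - x_1) - (x_1 - x_2)\right]
           = \tfrac{1}{9}(x_2 + x_3 - 2 x_1)
           = -\tfrac{1}{3}\, x_1 .
% This is a harmonic oscillator with
\omega^2 = \tfrac{1}{3}, \qquad \omega = \frac{1}{\sqrt{3}} .
```

The eigenvalues of the linearized system are purely imaginary, which is consistent with the statement that the central solution is stable but not asymptotically stable.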
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Fig. 46. Concentric orbits predicted by the mean-field approximation around the central point. 

Notice the contradiction between the results of simulations and the prediction of the mean-field approximation,
which lacks the asymptotically stable solution. Contrary to our naive expectation, this contradiction
becomes even more explicit at the level of the pair approximation.

The stochastic version of the three-species cyclic Lotka-Volterra system was studied by Reichenbach et al.
(2006a,b) within the formalism of an urn model, which allows the investigation of finite size effects. It is
found that fluctuations around the deterministic trajectories grow in time until two species become extinct. 
The extinction probability depends on the rescaled time t/N. 



7.3. Pair approximation and beyond 



In the pair approximation (Tainaka, 1994; Sato et al., 1997), the system is characterized by the probabilities
$p_2(s_1, s_2)$ of all strategy pairs $(s_1, s_2)$ on two neighboring sites ($s_1, s_2 = 1, 2, 3$). These quantities
are directly related to the one-site probabilities, $p_1(s_1)$, through the compatibility conditions discussed in
Appendix C. Neglecting technical details, now we discuss the results on the d-dimensional hyper-cubic lattice.

The equations of motion for the one-site configuration probabilities can be expressed by $p_2(s_1, s_2)$ as

$$ \begin{aligned}
\dot{p}_1(1) &= p_2(1,2) - p_2(1,3) , \\
\dot{p}_1(2) &= p_2(2,3) - p_2(2,1) , \\
\dot{p}_1(3) &= p_2(3,1) - p_2(3,2) .
\end{aligned} \tag{136} $$



Note that in this case the sum of the one-site configuration probabilities remains unchanged, while the 
conservation law for their product is no longer valid, in contrast with the prediction of the mean-field 
approximation [see Eq. (135)]. 

At the pair approximation level the time derivatives of the nine pair-configuration probabilities, $p_2(s_1, s_2)$
($s_1, s_2 = 1, 2, 3$), satisfy nine coupled first-order nonlinear differential equations (Tainaka, 1994; Sato et al.,
1997). For d > 1 numerical integration confirms that the dynamics tends to a state satisfying rotation and
reflection symmetries, $p_2(s_1, s_2) = p_2(s_2, s_1)$, independently of the initial state.

The equations have four trivial stationary solutions, the same as obtained in the mean-field approximation.
In one of these solutions $p_1(1) = 1$ and $p_2(1,1) = 1$, and the other one- and two-site configuration
probabilities are zero. For the non-trivial symmetric solution $p_1(1) = p_1(2) = p_1(3) = 1/3$, and the pair
configuration probabilities turn out to be (Tainaka, 1994; Sato et al., 1997)

$$ \begin{aligned}
p_2(1,1) = p_2(2,2) = p_2(3,3) &= \frac{2d+1}{9(2d-1)} , \\
p_2(1,2) = \cdots = p_2(3,2) &= \frac{2d-2}{9(2d-1)} .
\end{aligned} \tag{137} $$



In the limit $d \to \infty$ this solution reproduces the prediction of the mean-field approximation, $p_2(s_1, s_2) =
p_1(s_1) p_1(s_2) = 1/9$. In contrast with this, in d = 1 the pair approximation predicts a vanishing probability
for finding two different species on neighboring sites. This is due to a domain growth process, to be discussed 
in Sec. 7.4. 

On the square lattice (d = 2) the pair approximation predicts $p_2^{(2p)}(1,1) = 5/27 \approx 0.185$, which is
significantly lower than the Monte Carlo result, $p_2^{(MC)}(1,1) = 0.24595$. In Appendix C we show how to
deduce a correlation length from the pair configuration probabilities for two-state systems [see Eq. (C.10)].
Using this technique we can determine the correlation length for the corresponding autocorrelation function,

$$ \xi^{(2p)} = - \frac{1}{\ln \left( \frac{9}{2} \left[ p_2(1,1) - p_1(1) p_1(1) \right] \right)} . \tag{138} $$

Substituting the solution (137) into Eq. (138) we find $\xi^{(2p)} = 1/\ln 3 \approx 0.91$ for d = 2, which is again
significantly less than the simulation result.
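
The numbers quoted above can be reproduced with exact rational arithmetic. The sketch below simply evaluates Eqs. (137) and (138); the function names are hypothetical, and the sign convention of Eq. (138) is the one that yields the positive value 1/ln 3 stated in the text.

```python
from fractions import Fraction
import math


def pair_probabilities(d):
    """Eq. (137): (probability of an equal pair, probability of a given unequal pair)."""
    same = Fraction(2 * d + 1, 9 * (2 * d - 1))
    different = Fraction(2 * d - 2, 9 * (2 * d - 1))
    return same, different


def correlation_length(d):
    """Eq. (138) evaluated with the solution (137); defined for d >= 2 only,
    because the argument of the logarithm equals 1 for d = 1."""
    same, _ = pair_probabilities(d)
    argument = Fraction(9, 2) * (same - Fraction(1, 9))
    return -1.0 / math.log(float(argument))


if __name__ == "__main__":
    for d in (2, 3, 4):
        same, different = pair_probabilities(d)
        print(d, same, different, correlation_length(d))
```

Inserting Eq. (137) shows that the argument of the logarithm reduces to 1/(2d - 1), so within the pair approximation the correlation length is 1/ln(2d - 1) and decreases with increasing dimension.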

For d = 2 the numerical integration gives increasing oscillations in the strategy concentrations (Tainaka,
1994). Figure 47 illustrates that the amplitude of these oscillations approaches a saturation value with
an exponentially increasing time period. In practice the theoretically endless oscillations are stopped by
fluctuations (or rounding errors in the numerical integration), and the evolution terminates in one of the
absorbing states. A similar behavior was found for higher dimensions too, with a decreasing pitch of the
spiral for increasing d. The general properties of these so-called heteroclinic cycles and their relevance to
ecological models were described and discussed in a more general context by May and Leonard (1975) and
Hofbauer and Sigmund (1988).

The above numerical investigations suggest that the stationary solutions of the pair approximation are
unstable against small perturbations. This conjecture was confirmed analytically by Sato et al. (1997). This
property of the pair approximation is qualitatively different from the mean-field or the Monte Carlo results,
where the central solution is stable or asymptotically stable. This shortcoming of the pair approximation
can be eliminated by using the more accurate four- and nine-site approximations (Szabó et al., 2004), which
determine the probability of all possible configurations in the corresponding block. These approaches reproduce
qualitatively well the spiral trajectories converging towards the symmetric stationary solution, as was
found by the Monte Carlo simulations (see Fig. 45).



7.4. One-dimensional models



In the one-dimensional case the boundaries separating homogeneous domains move left or right with the
same average velocity. If two domain walls collide a domain disappears, as is illustrated in Fig. 48. At the
same time, the average size of the domains increases. Using Monte Carlo simulations Tainaka (1988) found
that the average number of domain walls decreases with time as $n_w \sim t^{-\alpha}$, where $\alpha \simeq 0.8$, contrary to the
prediction of the pair approximation, which suggests $\alpha = 1$. We will see that this discrepancy is related to
the formation of "superdomains" (Frachebourg et al., 1996a,b).



In this system one can distinguish six types of domain walls (or kinks), whose concentration is characterized
by the pair configuration probabilities $p_2(s_1, s_2)$ ($s_1 \neq s_2$). The compatibility conditions, discussed
in Appendix C, yield three constraints for the domain wall concentrations, $\sum_{s_2} p_2(s_1, s_2) = \sum_{s_2} p_2(s_2, s_1)$
($s_1$ = 1, 2, and 3). Despite naive expectations these constraints allow the breaking of the spatial reflection
symmetry in such a way that $p_2(1,2) - p_2(2,1) = p_2(2,3) - p_2(3,2) = p_2(3,1) - p_2(1,3)$. In particular cases
the system can develop into a state where all interfaces move either to the right [$p_2(1,2) = p_2(2,3) =
p_2(3,1) = 0$] or to the left [$p_2(2,1) = p_2(3,2) = p_2(1,3) = 0$]. Allowing this type of symmetry breaking, now
we recall the time-dependent solution of the pair approximation for the one-dimensional case.

Fig. 47. Time-dependence of strategy concentrations (left) as predicted by the pair approximation for d = 2. The system is
started from an uncorrelated initial state with $p_1(1) = p_1(2) = 0.36$ and $p_1(3) = 0.24$. The corresponding trajectory spirals out
and approaches the edges of the simplex (right).
Fig. 48. Time evolution of domains for random sequential update in the one-dimensional case.

For the "symmetric" solution we assume that $p_1(1) = p_1(2) = p_1(3) = 1/3$, $p_2(1,2) = p_2(2,3) = p_2(3,1) =
p_l/3$, and $p_2(2,1) = p_2(3,2) = p_2(1,3) = p_r/3$, where $p_l$ and $p_r$ denote the total concentration of left and
right moving interfaces. Under these conditions the pair approximation leads to

$$ \begin{aligned}
\dot{p}_l &= -2 p_l^2 - 2 p_l p_r + p_r^2 , \\
\dot{p}_r &= -2 p_r^2 - 2 p_r p_l + p_l^2 .
\end{aligned} \tag{139} $$

Starting from a random, uncorrelated initial state the time-dependence of the interface concentrations obeys
the following form (Frachebourg et al., 1996b):

$$ p_l(t) = p_r(t) = \frac{1}{3 + 3t} . \tag{140} $$



We remind the reader that this prediction does not agree with the result of the Monte Carlo simulations, 
and the correct description requires a more careful analysis. 
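
For completeness, the step from Eq. (139) to Eq. (140) can be written out explicitly. The derivation below is a worked sketch; the symbol $p$ for the common value of the two interface concentrations is introduced here only for convenience.

```latex
% Symmetric case p_l = p_r = p in Eq. (139):
\dot{p} = -2p^2 - 2p^2 + p^2 = -3p^2
\quad\Longrightarrow\quad
\frac{d}{dt}\,\frac{1}{p} = 3
\quad\Longrightarrow\quad
\frac{1}{p(t)} = \frac{1}{p(0)} + 3t .
% Uncorrelated initial state with p_1(s) = 1/3: every pair has probability
% p_2(s_1, s_2) = 1/9, and p_l collects three of the six unequal pairs:
p(0) = 3 \cdot \frac{1}{9} = \frac{1}{3}
\quad\Longrightarrow\quad
p_l(t) = p_r(t) = \frac{1}{3 + 3t} .
```

The total interface concentration therefore decays as $1/t$ within the pair approximation, which corresponds to the exponent $\alpha = 1$ mentioned earlier.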

Figure 48 clearly shows that the evolution of the domain structure is governed by two basic types of
elementary processes (Tainaka, 1989). If two oppositely moving interfaces collide they annihilate each other.
On the other hand, the collision of two right (left) moving interfaces leads to their mutual annihilation,
and simultaneously the creation of a left (right) moving interface. Equations (139) can be considered as the
corresponding rate equations (Frachebourg et al., 1996b).

As a result of these elementary processes, one can observe the formation of superdomains (see Fig. 48),
in which all the internal interfaces move in the same direction. The coarsening pattern can be characterized
by two different length scales as illustrated by the following configuration:

P SSSPPPPRRRR SSSS PPRRRSSS R .    (141)

Here $\mathcal{L}$ denotes the size of the superdomain and $\ell$ is the size of a classical domain. Evidently, the average
size of domains is related directly to the total number of interfaces, i.e., $1/(p_l + p_r) = \langle \ell \rangle$.

Inside superdomains the interfaces perform biased random walks. Consequently, two parallel moving
interfaces can meet and annihilate each other while a new, oppositely moving interface is created. This in turn
is annihilated by a suitable neighboring interface within a short time. This domain wall dynamics is similar
to the one-dimensional ballistic annihilation process, when each particle has a fixed average velocity
which may be either +1 or -1 with equal probability (Ben-Naim et al., 1996b). Using scaling arguments
Frachebourg et al. (1996a,b) have shown that in these systems the superdomain and domain sizes grow
respectively as

$$ \langle \mathcal{L}(t) \rangle \sim t \tag{142} $$

and

$$ \langle \ell(t) \rangle \sim t^{3/4} . \tag{143} $$

Monte Carlo simulations have confirmed the above theoretical predictions. Frachebourg et al. (1996b) have
observed that the average domain size increases approximately as $\langle \ell(t) \rangle^{(MC)} \sim t^{\alpha}$ with $\alpha \simeq 0.79$, while the
local slope $\alpha(t) = d \ln \ell(t) / d \ln t$ approaches the asymptotic value $\alpha = 3/4$. This is an interesting result,
because separately both the diffusion-controlled and the ballistic controlled annihilation processes yield the
same coarsening exponent 1/2, whereas their combination gives a higher value, 3/4.
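
The local slope mentioned above is a convenient diagnostic for slowly converging exponents. As a minimal sketch, assuming the average domain size has been recorded at a list of increasing times, it can be estimated by finite differences on a log-log scale; the function name and the synthetic test data are hypothetical.

```python
import math


def local_slopes(times, sizes):
    """Finite-difference estimate of d ln(size) / d ln(t) between
    successive samples; returns one value per interval."""
    slopes = []
    for k in range(len(times) - 1):
        d_log_size = math.log(sizes[k + 1]) - math.log(sizes[k])
        d_log_time = math.log(times[k + 1]) - math.log(times[k])
        slopes.append(d_log_size / d_log_time)
    return slopes


if __name__ == "__main__":
    times = [2 ** k for k in range(1, 12)]
    sizes = [t ** 0.75 for t in times]      # synthetic pure power law, not simulation data
    print(local_slopes(times, sizes))
```

Sampling the times on a geometric grid, as in the usage example, keeps the intervals equally wide on the logarithmic axis.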



7.5. Global oscillations on some structures 



The pair approximation cannot handle short loops, which characterize all d-dimensional spatial structures.
It is expected, however, that on tree-like structures, e.g., on the Bethe lattice, the qualitative prediction of
the pair approximation can be valid. The Rock-Scissors-Paper game on the Bethe lattice with a coordination
number z was investigated by Sato et al. (1997) using the pair approximation. Their results are equivalent to
those discussed above when substituting z for 2d in Eq. (137). Unfortunately, the validity of these analytical
results cannot be confirmed directly by Monte Carlo simulations, because of boundary effects present in any
finite system. It is expected, however, that the growing spiral trajectories seen in Fig. 47 can be observed on
random regular graphs for sufficiently large N. The Monte Carlo simulations (Szolnoki and Szabó, 2004b)
have confirmed this analytical result on random regular graphs with z = 6 (and similar behavior is expected
for z > 6, too). However, for z = 3 and 4 the evolution tends towards a limit cycle as demonstrated in
Fig. 49.

Fig. 49. Emergence of global oscillations in the strategy concentrations for the Rock-Scissors-Paper game on random regular
graphs with z = 3. A Monte Carlo simulation for $N = 3 \cdot 10^6$ sites shows that the system evolves from an uncorrelated initial
state (denoted by the plus sign) towards a limit cycle (thick orbit). The dashed orbit indicates the limit cycle predicted by the
generalized mean-field method at the six-site approximation level. The corresponding six-site cluster is indicated on the right.

For z = 3 the generalized mean-field analysis can also be performed with four- and six-site clusters. The
four-site approximation gives growing spiral trajectories, whereas the six-site approximation predicts a limit
cycle in good quantitative agreement with the Monte Carlo simulation (see the thick solid and dashed lines
in Fig. 49). It is conjectured that the growth of the global oscillation is blocked by fluctuations whose role 
increases with decreasing z. 

Different quantities can be introduced to characterize the extension of limit cycles. The simplest quantity is 
the amplitude, i.e., the difference between the maximum and minimum values of the strategy concentrations. 
However, this quantity is strongly affected by fluctuations, whose magnitude depends on the size of the 
system. There are other ways to quantify limit cycles, which are less sensitive to noise and to system size 
effects. For example, the limit cycle can be characterized by its average distance from the center. In addition,
one can define an order parameter $\Phi$ as the average relative area of the limit cycle on the ternary phase
diagram (see Fig. 49), compared to its maximum possible value (the area of the triangle). According to
Monte Carlo results on random regular graphs, $\Phi^{(RRG)} = 0.750(5)$ and 0.980(1) for z = 3 and 4, respectively
(Szabó et al., 2004; Szolnoki and Szabó, 2004a).

An alternative definition of the order parameter can be based on the constant of motion in the mean-field
approximation, Eq. (135), i.e., $\Phi' = 1 - 3^3 \langle p_1(1) p_1(2) p_1(3) \rangle$. It was found that the product of strategy
concentrations only exhibits a weak periodic oscillation along the limit cycle (the amplitude is about a few
percent of the average value). The order parameters $\Phi$ and $\Phi'$ can be easily calculated either in simulations
or in the generalized mean-field approximation.
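
Both order parameters can be evaluated from a recorded orbit of concentration triples. The sketch below is an illustration under stated assumptions: the orbit is taken to be one closed cycle sampled as a list of points (p1, p2, p3), the enclosed area is computed with the shoelace formula in the (p1, p2) projection (area ratios are unchanged by the affine map to the usual ternary diagram), and the average in the second parameter is a plain average over the samples. The function names are hypothetical.

```python
def relative_area(orbit):
    """Area enclosed by a closed orbit divided by the area of the simplex.
    In the (p1, p2) projection the simplex is a triangle of area 1/2, so the
    ratio equals the absolute value of twice the shoelace area."""
    twice_area = 0.0
    n = len(orbit)
    for k in range(n):
        x0, y0 = orbit[k][0], orbit[k][1]
        x1, y1 = orbit[(k + 1) % n][0], orbit[(k + 1) % n][1]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area)


def product_order_parameter(orbit):
    """1 - 27 <p1 p2 p3>, with the average taken over the samples of the orbit."""
    mean_product = sum(p[0] * p[1] * p[2] for p in orbit) / len(orbit)
    return 1.0 - 27.0 * mean_product


if __name__ == "__main__":
    corners = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    center = [(1 / 3, 1 / 3, 1 / 3)] * 4
    print(relative_area(corners), product_order_parameter(corners))
    print(relative_area(center), product_order_parameter(center))
```

An orbit running along the boundary of the triangle gives 1 for both quantities and an orbit collapsed to the central point gives 0, in line with the limiting values discussed in the text. In a noisy simulation the area would additionally have to be averaged over many cycles.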

Evidently, both order parameters become zero on the square lattice and 1 on the random regular graph 
with a degree of z = 6 and above. Thus, using these order parameters we can analyze quantitatively the 
transitions between the above mentioned three different sorts of behavior.

Now we present a simple model exhibiting two subsequent phase transitions when an adequate control
parameter r is tuned. The parameter r characterizes the temporal randomness in the connectivity structure
(Szabó et al., 2004). For r = 0 the model is equivalent to the spatial Rock-Scissors-Paper game on the square
lattice. A randomly chosen player adopts the strategy of a randomly chosen neighbor provided that the
latter is dominant. With probability 0 < r < 1 a standard (nearest) neighbor is replaced by a randomly
chosen other player in the system, and irrespective of their distance a strategy adoption event occurs.
When r = 1, partners are chosen uniformly at random, and the mean-field approximation becomes exact for
sufficiently large N.
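
One possible reading of this rule is sketched below. It is an illustration only, not the code of the cited work: the integer encoding 0 = R, 1 = S, 2 = P, the function names, and the detail that the random partner is drawn uniformly among all other sites are assumptions of this example.

```python
import random

NEIGHBOR_OFFSETS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def invades(a, b):
    """Assumed encoding 0 = R, 1 = S, 2 = P: strategy a dominates (a + 1) % 3."""
    return (a + 1) % 3 == b


def pick_partner(x, y, L, r, rng):
    """With probability r return a uniformly chosen other site, otherwise a
    randomly chosen nearest neighbor (periodic boundaries)."""
    if rng.random() < r:
        while True:
            px, py = rng.randrange(L), rng.randrange(L)
            if (px, py) != (x, y):
                return px, py
    dx, dy = rng.choice(NEIGHBOR_OFFSETS)
    return (x + dx) % L, (y + dy) % L


def adoption_step(lattice, r, rng):
    """A randomly chosen player adopts the partner's strategy if it is dominant."""
    L = len(lattice)
    x, y = rng.randrange(L), rng.randrange(L)
    px, py = pick_partner(x, y, L, r, rng)
    if invades(lattice[px][py], lattice[x][y]):
        lattice[x][y] = lattice[px][py]


if __name__ == "__main__":
    rng = random.Random(7)
    L = 30
    lattice = [[rng.randrange(3) for _ in range(L)] for _ in range(L)]
    for _ in range(100 * L * L):
        adoption_step(lattice, r=0.1, rng=rng)
    print([sum(row.count(s) for row in lattice) / (L * L) for s in range(3)])
```

Setting r = 0 recovers purely nearest-neighbor partners, while r = 1 removes all spatial structure from the partner choice.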

The results of the Monte Carlo simulation in Fig. 50 show three qualitatively different behaviors. For weak
temporal randomness [$r < r_1 = 0.020(1)$] the spatial distribution of the strategies remains similar to the
snapshots observed for r = 0 (see Fig. 43). On the sites of the lattice the three strategies cyclically follow
each other in a local sense. The short range interactions are not able to synchronize these local oscillations.
In the region $r_1 < r < r_2 = 0.170(1)$ global oscillations occur. Finally, for $r > r_2$, following a growing spiral
trajectory, the (finite) system evolves into one of the three homogeneous states and remains there forever.
The numerical values of $r_1$ and $r_2$ are surprisingly low, and indicate high sensitivity to the introduction of
temporal randomness. Similar transitions can be observed for quenched randomness in random regular
small-world structures. In this case, however, the corresponding threshold values are higher (Szolnoki and Szabó,
2004a).

Fig. 50. Monte Carlo results for $\Phi$, the relative area of the limit cycle, as a function of r, the probability of choosing random
co-players instead of nearest neighbors on the square lattice.



The appearance of the global oscillation is a Hopf bifurcation (Hofbauer and Sigmund, 1998; Kuznetsov,
1995), as is indicated by the linear increase of $\Phi$ above $r_1$ (see footnote 14).

When $r \to r_2$ the order parameter $\Phi$ goes to 1 very smoothly. The Monte Carlo data are consistent with a
power-law behavior: $1 - \Phi \sim (r_2 - r)^{\gamma}$ where $\gamma = 3.3(4)$. This power-law behavior seems to be a very robust
feature of three-state systems with cyclic symmetries, because similar exponents were found for several other
(regular) connectivity structures. It is worth noting that the generalized mean-field technique at the six-site
approximation level (the corresponding cluster is shown in Fig. 49) reproduces this behavior on the Bethe
lattice with degree z = 3 (Szolnoki and Szabó, 2004a).

The generalization of this model on other non-regular (quenched or temporal) connectivity structures is 
not straightforward. Different rules can be postulated to handle the variations in the degree. For example, 
within an elementary invasion process one can choose a site and one of its neighbors randomly, or one can 
select one of the edges of the connectivity graph with equal probability (the latter case favors the selection
of sites with larger degree). Masuda and Konno (2006) have studied a case when the randomly chosen
site (a potential predator) invades all possible prey sites in its neighborhood. The simulations indicated
clearly that the system develops into the symmetric stationary state on both Erdős-Rényi and scale-free
(Barabási-Albert) networks. In the light of these results it would be interesting to see what happens on
other non-regular networks when using different evolutionary rules, and to clarify the relevant ingredients 
affecting the final stationary state. 
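
The difference between the two selection rules mentioned above can be quantified exactly on a small graph. The following sketch computes, for each site, the probability of taking part in an elementary process under the "site, then neighbor" rule and under the "uniform edge" rule. The adjacency-list representation, the function names, and the four-site path graph used in the example are assumptions made for illustration; every site is assumed to have at least one neighbor.

```python
from fractions import Fraction


def participation_site_rule(adjacency):
    """Pick a site uniformly, then one of its neighbors uniformly.
    Returns the probability that each site belongs to the chosen pair."""
    n = len(adjacency)
    prob = {v: Fraction(0) for v in adjacency}
    for site, neighbors in adjacency.items():
        for partner in neighbors:
            weight = Fraction(1, n * len(neighbors))
            prob[site] += weight
            prob[partner] += weight
    return prob


def participation_edge_rule(adjacency):
    """Pick one edge uniformly: a site of degree k takes part with probability k / E."""
    n_edges = sum(len(neighbors) for neighbors in adjacency.values()) // 2
    return {v: Fraction(len(neighbors), n_edges) for v, neighbors in adjacency.items()}


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # hypothetical four-site chain
    print(participation_site_rule(path))
    print(participation_edge_rule(path))
```

On this chain the degree-2 sites take part with probability 5/8 under the site rule and 2/3 under the edge rule, while for the end sites the values are 3/8 and 1/3, which illustrates the bias of edge selection towards sites of larger degree.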



14 Note that in a Hopf bifurcation the amplitude increases by a square-root law above the transition point. However, in our
case the order parameter is the area $\Phi$, which is quadratic in the amplitude.
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7.6. Rotating spiral arms 



The self-organizing spatiotemporal patterns of the Rock-Scissors-Paper game on the square lattice are characterized
by moving invasion fronts. Let us recall first the relevant topological difference between three- and
two-color patterns. On the continuous plane two-color patterns are topologically equivalent to a structure of
"islands in lakes in islands in lakes in ...", at least if four-edge vertices are neglected. In fact the probability
of these four-edge vertices vanishes both in the continuum limit and also in the advanced states of the
domain growth process for smooth interfaces. Furthermore, on the triangular lattice four-edge vertices are
forbidden topologically. In contrast with this, three-edge vertices are distinctive points where three domains
(or domain walls) meet on a three-color pattern. Due to cyclic dominance among the three states, the edges
of these objects rotate clockwise or anti-clockwise. We will call these structures vortices and anti-vortices,
respectively.

The three edges of the vortices and anti-vortices form spiral arms, because their average normal velocity 
is approximately constant. At the same time, the deterministic part of the interface motion is strongly 
decorated by noise, and the resultant irregularity is able to suppress the "deterministic" features of these 
rotating spiral arms. 

Rotating spiral arms become nicely visible in models where the motion of the invasion fronts is affected by
some surface tension. As the schematic three-color pattern in the right panel of Fig. 51 indicates, vortices and
anti-vortices alternate along interfaces. In other words, a vortex is linked to three (not necessarily
different) anti-vortices by its arms, and vice versa. It also implies that vortices and anti-vortices are created
or annihilated in pairs during interface motion (Szabo et al., 1999).








Fig. 51. The left snapshot shows a typical rotating spiral structure of the three strategies on a block of $500 \times 500$ sites of a
square lattice. The parameters of the invasion probability (144) are $\kappa = 4/3$ and $\varepsilon = 0.05$. The schematic plot (right) illustrates how vortices (black
dots) and anti-vortices (white dots) are positioned in the three-color model. A vortex and anti-vortex can be connected to each
other by one, two, or three edges.

Smooth interfaces can be observed for dynamical rules that suppress the probability of elementary processes
trying to increase the length of an interface. This requirement can be fulfilled by a simple model
(Szabo and Szolnoki, 2002), where the probability of nearest neighbor ($y \in \Omega_x$) invasions is

$$P(s_x \to s_y) = \frac{1}{1 + \exp[\kappa \Delta E_P + \varepsilon A(s_x, s_y)]} . \tag{144}$$



Cyclic dominance is built into the model through the payoff matrix (132) with a magnitude denoted by $\varepsilon$,
and $\kappa$ characterizes how interfacial energy is taken into account. The total length of the interface is defined
by the Potts energy (Wu, 1982),

$$E_P = \sum_{\langle x, y \rangle} [1 - \delta(s_x, s_y)] , \tag{145}$$

where the summation runs over all nearest neighbor pairs and $\delta(s, s')$ is the Kronecker delta. $\Delta E_P$ denotes
the difference in the interface length between the final and the initial states. Notice that this rule enhances
(suppresses) those elementary invasion events which decrease (increase) the total length of the interface. It
also inhibits the creation of new islands inside homogeneous domains. For $\kappa > 2$ the formation of horizontal
and vertical interfaces is favored (as for the Potts model at low temperatures during the domain growth
process) due to the anisotropic interfacial energy on the square lattice. However, this undesired effect becomes
irrelevant in the region $\kappa < 4/3$ (see left snapshot of Fig. 51), where our analysis mostly concentrated.
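
As an illustration of how Eqs. (144) and (145) combine in a single elementary move, the following minimal sketch evaluates the invasion probability on a small periodic square lattice. The matrix `A` is only an assumed stand-in for the payoff matrix (132), which is not reproduced in this section (states are labeled 0, 1, 2 with 0 beating 1, 1 beating 2, and 2 beating 0), and all function names are hypothetical.

```python
import math

# Assumed stand-in for the payoff matrix (132): A[s][t] = +1 if s beats t,
# -1 if t beats s, and 0 otherwise (cyclic order 0 -> 1 -> 2 -> 0).
A = [[0, 1, -1],
     [-1, 0, 1],
     [1, -1, 0]]

def neighbors(x, y, L):
    """Four nearest neighbors on an L x L square lattice (periodic, L >= 3)."""
    return [((x + 1) % L, y), ((x - 1) % L, y), (x, (y + 1) % L), (x, (y - 1) % L)]

def delta_potts(lattice, site, new_state):
    """Change of E_P (number of unlike neighbor pairs) if `site` adopts `new_state`."""
    L = len(lattice)
    x, y = site
    old_state = lattice[x][y]
    change = 0
    for nx, ny in neighbors(x, y, L):
        s = lattice[nx][ny]
        change += (s != new_state) - (s != old_state)
    return change

def invasion_probability(lattice, site, invader, kappa, eps):
    """Eq. (144): probability that `site` adopts the state of its neighbor `invader`."""
    s_x = lattice[site[0]][site[1]]
    s_y = lattice[invader[0]][invader[1]]
    dE = delta_potts(lattice, site, s_y)
    return 1.0 / (1.0 + math.exp(kappa * dE + eps * A[s_x][s_y]))

# Example: a single site of state 1 inside a homogeneous domain of state 0.
lat = [[0] * 4 for _ in range(4)]
lat[1][1] = 1
p_remove_island = invasion_probability(lat, (1, 1), (1, 2), kappa=1.0, eps=0.05)
p_grow_island = invasion_probability(lat, (1, 2), (1, 1), kappa=1.0, eps=0.05)
```

In this example removing the one-site island shortens the interface by four units, so `p_remove_island` is close to 1, while letting the island grow lengthens the interface by two units and is correspondingly suppressed.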

For $\kappa = 0$, the interfacial energy does not play a role, and the model reproduces a version of the
Rock-Scissors-Paper game in which the probability of the direct invasion process is reduced,
$P = [1 + \tanh(\varepsilon/2)]/2 < 1$, and the inverse process is also allowed with probability $1 - P$. Tainaka and Itoh (1991)
demonstrated that when $P \to 1/2$ (or $\varepsilon \to 0$) this model becomes equivalent to the three-state voter model
(Liggett, 1985; Clifford and Sudbury, 1973; Holley and Liggett, 1975). The behavior of the voter model depends
on the spatial dimension $d$ (Ben-Naim et al., 1996a). In two dimensions a very slow (logarithmic)
domain coarsening occurs for $P = 1/2$. In the presence of cyclic dominance ($\varepsilon > 0$ or $P > 1/2$), however,
the domain growth stops at a "typical size" that diverges as $P \to 1/2$. The mean-field aspects of this system
were considered by Ifti and Bergersen (2003).
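
The closed form quoted for vanishing interfacial coupling can be checked in one line from Eq. (144). The sketch below assumes that the payoff matrix (132), which is not reproduced in this section, gives $A(s_x, s_y) = -1$ when $s_y$ is the predator of $s_x$ and $+1$ in the opposite case; this sign convention is inferred from the requirement $P > 1/2$ for $\varepsilon > 0$.

```latex
\begin{align*}
P_{\mathrm{direct}} &= \frac{1}{1 + e^{-\varepsilon}}
  = \frac{e^{\varepsilon/2}}{e^{\varepsilon/2} + e^{-\varepsilon/2}}
  = \frac{1}{2}\left[1 + \tanh\frac{\varepsilon}{2}\right] , \\
P_{\mathrm{inverse}} &= \frac{1}{1 + e^{\varepsilon}}
  = \frac{1}{2}\left[1 - \tanh\frac{\varepsilon}{2}\right]
  = 1 - P_{\mathrm{direct}} .
\end{align*}
```

For small $\varepsilon$ this gives $P_{\mathrm{direct}} \approx 1/2 + \varepsilon/4$, so the distance from the voter-model point can be measured equivalently by $|P - 1/2|$ or by $|\varepsilon|$.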

In order to quantify the divergence of the typical length scale Tainaka and Itoh (1991) considered the
average concentration of vortices and found a power-law behavior, $\rho_v \sim |P - 1/2|^{\beta} \sim |\varepsilon|^{\beta}$. The numerical fit
to the Monte Carlo data predicted $\beta = 0.4$. Simulations performed by Szabo and Szolnoki (2002) on larger
systems indicate a decrease in $\beta$ for smaller $\varepsilon$. Apparently, the Monte Carlo data in Fig. 52 are consistent
with $\beta = 1/4$. Unfortunately this conjecture is not yet supported by theoretical arguments, and a more
rigorous numerical confirmation is hindered by the fast increase of the relaxation time as $\varepsilon \to 0$.

Fig. 52. Log-log plot of the concentration of vortices as a function of $\varepsilon$ for $\kappa = 1$ (squares), 1/4 (triangles), 1/16 (pluses), 1/64
(diamonds), and 0 (circles). The solid (dashed) line represents a power-law behavior with an exponent of 2 (resp., 1/4).



Figure 52 shows a strikingly different behavior when the Potts energy is switched on, i.e., $\kappa > 0$. The
Monte Carlo data indicate a faster decrease in the concentration of vortices, tending towards a quadratic
behavior, $\rho_v \sim \varepsilon^2$, in the limit $\varepsilon \to 0$ for any $0 < \kappa < 1$.

The geometrical features of the interfaces in Fig. 51 demonstrate that these self-organizing patterns cannot
be characterized by a single length scale [e.g., the correlation length, the average (horizontal or vertical)
distance between the interfaces, or the inverse of $E_P/2N$] as happens for traditional domain growth
processes [for a survey on ordering phenomena see the review by Bray (1994)]. In the present case additional
length scales can be introduced to characterize the average distance between the connected vortex-anti-vortex
pairs or to describe the geometrical features of the (rotating) spiral arms.

On the square lattice the interfaces are polygons consisting of unit-length elements whose relative tangential
rotation can be $\Delta\theta = 0$ or $\pm\pi/2$. The tangential rotation of a vortex edge can be determined by
summing up the values of $\Delta\theta$ along the given edge step by step from the vortex to the connected anti-vortex
(Szabo and Szolnoki, 2002). One can also determine the total length of the interface connecting a
vortex-anti-vortex pair, as well as the average curvature for the given polygon. The average values of these
quantities characterize the pattern. Besides vortex-anti-vortex edges one can observe islands, whose total
perimeter can be evaluated from the Potts energy, Eq. (145), and the total length of the vortex
edges. Furthermore, vortices can also be classified by the number of anti-vortices to which the given vortex
is linked by vortex edges (Szolnoki and Szabo, 2004b). For example, the collision of two different islands
(e.g., states 1 and 2 in the sea of state 0) creates a vortex-anti-vortex pair linked to each other by three
common vortex edges (see Fig. 51). If the evolution of the interfaces is affected by some surface tension
reducing the total length, then this process will decrease the portion of vortex-anti-vortex pairs linked to
each other by more than one edge.
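
The step-by-step summation of the tangential rotation described above can be sketched as follows. The representation of a vortex edge as a list of unit steps `(dx, dy)` and the function name are assumptions made for this illustration; they are not taken from the cited papers.

```python
import math

def tangential_rotation(steps):
    """Sum the turning angles along a lattice path given as unit steps (dx, dy)."""
    total = 0.0
    for (ax, ay), (bx, by) in zip(steps, steps[1:]):
        cross = ax * by - ay * bx   # +1 for a left turn, -1 for a right turn, 0 if straight
        dot = ax * bx + ay * by     # +1 if the path goes straight on
        if dot < 0:
            raise ValueError("a lattice interface cannot reverse direction")
        total += cross * math.pi / 2
    return total

straight = [(1, 0), (1, 0), (1, 0)]
staircase = [(1, 0), (0, 1), (1, 0), (0, 1)]
winding = [(1, 0), (0, 1), (-1, 0), (0, -1)]
```

A straight segment gives zero rotation, a staircase accumulates only the net turn of its last step, and a path winding around a plaquette accumulates $3\pi/2$; averaging such sums over all vortex edges of a pattern yields the average tangential rotation discussed in the text.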

Unfortunately, a geometrical analysis of the vortex edges on the square lattice requires removal of all
four-edge vertices for the given pattern. During this manipulation the four-edge vertex is considered as an
instantaneous object which is about to transform into a vortex-anti-vortex pair or into two non-crossing
interfaces. In many cases the effect of this manipulation is negligible and the subsequent geometrical analysis
gives useful information about the three-color pattern (Szabo and Szolnoki, 2002). It turns out, for example,
that the average tangential rotation $\theta$ of the vortex edges increases as $\varepsilon$ decreases, if the surface tension is
switched on. However, this quantity goes smoothly to zero when $\varepsilon \to 0$ for $\kappa = 0$ (voter model limit). In this
latter case the vortex edges do not show the characteristic features of "spiral arms". The contribution of
islands to the total length of interfaces is practically negligible for $\kappa > 0$. Consequently, pattern formation
is governed dominantly by the confronting rotating spiral arms.

In the absence of surface tension ($\kappa = 0$) interfacial roughening (Provata and Tsekouras, 2003; Dornic et al.,
2001) plays a crucial role, as is illustrated by Fig. 43. The interfaces become more and more irregular, and
the occasional overhanging brings about a mechanism that creates new islands. Moreover, the confronting
fragments can also enclose a homogeneous territory and create islands. Due to these mechanisms the total
perimeter of the islands becomes comparable to the total length of the vortex edges (Szabo and Szolnoki,
2002). The resultant patterns and dynamics differ drastically from those appearing for smooth interfaces.
Despite the huge theoretical efforts aimed at clarifying the main features of spiral patterns [for a review
see Cross and Hohenberg (1993)], our current knowledge is still rather poor about the relationship between
geometrical and dynamical characteristics. Using a geometrical approach suggested by Brower et al. (1984),
the time evolution of an invasion front between a fixed vortex-anti-vortex pair was studied numerically by
Meron and Pelce (1988). Unfortunately, the deterministic behavior of a single rotating spiral front has not
yet been compared to the geometrical parameters averaged over vortex edges in the self-organizing patterns.
Possible geometrical parameters of interest can be the concentration of vortices ($\rho_v$), the total length of
interfaces (or Potts energy), the total length and average tangential rotation (or curvature) of the vortex
edges, whereas the dynamical parameters are those that describe the average normal velocity of invasion
fronts and quantify interfacial roughening.

On the simple cubic lattice the above evolutionary Rock-Scissors-Paper game exhibits a self-organizing
pattern, whose two-dimensional cross-section is similar to the one illustrated in Fig. 43 (Tainaka, 1994).
The cores of the vortices form directed strings in three dimensions, whose direction refers to being a vortex
or an anti-vortex on the 2D cross-section. The core string typically closes in a ring, resembling a long thin
toroid. The possible topological features of how the loops of these vortex strings are formed and linked
together are discussed theoretically by Winfree and Strogatz (1984). The three-dimensional simulations of
the Rock-Scissors-Paper game by Tainaka (1994) can serve as an illustration of the basic properties. Here it is
worth mentioning that the investigation of evolving vortex string networks has a tradition in different areas of
physics from topological defects in solids to cosmic strings (Vilenkin and Shellard, 1994). Many aspects of the
associated dynamical rules and the generation of entangled string networks have also been extensively studied
for superfluid turbulence [for surveys see Schwarz (1982); Martins et al. (200l); Bradley et al. (2005)]. The
predicted richness of possible behaviors has not yet been fully identified in cyclically dominated three-state
games.






7.7. Cyclic dominance with different invasion rates 



The Rock-Scissors-Paper game described above remains invariant under a cyclic permutation of species
(strategies). However, some basic features of this system can also be observed when the exact symmetry
does not hold. In order to demonstrate the distortion of the solution, first we study the effect of unequal
invasion rates within the mean-field approximation. In this case the equations of motion [whose symmetric
version was given in Eqs. (134)] take the form:



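$$\begin{aligned}
\dot{p}_1(1) &= p_1(1)\,[w_{12}\,p_1(2) - w_{31}\,p_1(3)] , \\
\dot{p}_1(2) &= p_1(2)\,[w_{23}\,p_1(3) - w_{12}\,p_1(1)] , \\
\dot{p}_1(3) &= p_1(3)\,[w_{31}\,p_1(1) - w_{23}\,p_1(2)] ,
\end{aligned} \tag{146}$$

where $w_{ss'} > 0$ defines the invasion rate between the predator $s$ and its prey $s'$. As for the symmetric case,
these equations have three trivial homogeneous solutions,

$$\begin{aligned}
p_1(1) &= 1 , & p_1(2) &= 0 , & p_1(3) &= 0 ; \\
p_1(1) &= 0 , & p_1(2) &= 1 , & p_1(3) &= 0 ; \\
p_1(1) &= 0 , & p_1(2) &= 0 , & p_1(3) &= 1 ,
\end{aligned} \tag{147}$$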

which are unstable against the attack of the corresponding predators as discussed in Section 7.2. Furthermore,
the system has another stationary solution which describes the coexistence of all three species with
concentrations

$$p_1(1) = \frac{w_{23}}{W} , \qquad p_1(2) = \frac{w_{31}}{W} , \qquad p_1(3) = \frac{w_{12}}{W} , \tag{148}$$

where $W = w_{12} + w_{23} + w_{31}$. It should be emphasized that all species survive with the above concentrations,
i.e., cyclic dominance provides a way for their coexistence in a wide range of parameters (Gilpin, 1975;
May and Leonard, 1975; Tainaka, 1993; Durrett and Levin, 1997).

Notice, however, the counterintuitive response to the variation of the invasion rates. Naively, one may
expect that the predator with the largest invasion rate should be in the most beneficial position. Contrary to
this, Eq. (148) states that the equilibrium concentration of a given species is proportional to the invasion
rate of its prey. In fact, this unusual behavior is a direct consequence of cyclic dominance in all three-state
systems.
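
As a brief check, the coexistence solution can be read off from Eqs. (146) by requiring each bracket to vanish. The following sketch uses only the symbols defined above.

```latex
\begin{gather*}
w_{12}\,p_1(2) = w_{31}\,p_1(3) , \qquad
w_{23}\,p_1(3) = w_{12}\,p_1(1) , \qquad
w_{31}\,p_1(1) = w_{23}\,p_1(2) \\
\Longrightarrow \quad p_1(1) : p_1(2) : p_1(3) = w_{23} : w_{31} : w_{12} .
\end{gather*}
```

Normalizing with $p_1(1) + p_1(2) + p_1(3) = 1$ gives Eq. (148). The concentration of species 1 therefore scales with $w_{23}$, the rate at which its prey (species 2) invades species 3, and does not grow with its own invasion rate $w_{12}$.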

In order to clarify this unexpected response, let us consider a simple example. Suppose that the concentration
of species 1 is increased externally. In this case species 1 consumes more of species 2, whose
concentration decreases. As a result, species 2 consumes less of species 3. Finally, species 3 gets the advantage
in the dynamical balance, because it has more prey and fewer predators. In this example the external
support can be realized by either increasing the corresponding invasion rate (here $w_{12}$), or converting randomly
chosen individuals into species 1. The latter situation was analyzed by Tainaka (1993), who studied
what happens in a three-candidate (cyclic) voter model when one of the candidates is supported by the
mass media during an election campaign. Further biological examples are discussed in Frean and Abraham
(2001) and Durrett (2002). The rigorous mathematical treatment of these equations of motion was given by
Hofbauer and Sigmund (1988) and Gao (1999, 2000).

In analogy to the symmetric case, a constant of motion can be constructed for Eqs. (146). It is easy to
check that

$$p_1(1)^{w_{23}} \, p_1(2)^{w_{31}} \, p_1(3)^{w_{12}} = \text{constant} . \tag{149}$$

This means that the system evolves along closed orbits, i.e., the concentrations return periodically to their
initial values as illustrated in Fig. 53.
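
The statements about the coexistence point and the constant of motion can be verified numerically with a short sketch. The Runge-Kutta integrator, the time step, and the initial state are illustrative choices and are not part of the original analysis; the rates are those quoted for Fig. 53.

```python
def rhs(p, w12, w23, w31):
    """Right-hand side of the mean-field equations (146)."""
    p1, p2, p3 = p
    return (p1 * (w12 * p2 - w31 * p3),
            p2 * (w23 * p3 - w12 * p1),
            p3 * (w31 * p1 - w23 * p2))

def rk4_step(p, dt, *w):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, k, h):
        return tuple(b + h * ki for b, ki in zip(base, k))
    k1 = rhs(p, *w)
    k2 = rhs(shifted(p, k1, dt / 2), *w)
    k3 = rhs(shifted(p, k2, dt / 2), *w)
    k4 = rhs(shifted(p, k3, dt), *w)
    return tuple(pi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for pi, a, b, c, d in zip(p, k1, k2, k3, k4))

def invariant(p, w12, w23, w31):
    """Constant of motion (149)."""
    return p[0] ** w23 * p[1] ** w31 * p[2] ** w12

def fixed_point(w12, w23, w31):
    """Coexistence solution (148)."""
    W = w12 + w23 + w31
    return (w23 / W, w31 / W, w12 / W)

rates = (2.0, 1.0, 1.0)   # w12, w23, w31 as in Fig. 53
p = (0.5, 0.3, 0.2)       # arbitrary illustrative initial state
c0 = invariant(p, *rates)
for _ in range(20000):
    p = rk4_step(p, 0.001, *rates)
drift = abs(invariant(p, *rates) - c0) / c0
```

With these rates `fixed_point(*rates)` returns concentrations 1/4, 1/4, and 1/2, so species 3 (the predator of the seemingly favored species 1) is the most abundant, and `drift` stays at the level of the integration error, as expected for motion along a closed orbit.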

In short, according to the mean-field analysis the qualitative behavior of such asymmetric Rock-Scissors-Paper
systems remains unchanged, despite the fact that different invasion rates break the cyclic symmetry
(Ifti and Bergersen, 2003). A special case of the above system with $w_{12} = w_{23} \neq w_{31}$ was analyzed by
Tainaka (1995) using the pair approximation, which confirmed the above predictions. Furthermore, the pair
approximation predicts that trajectories spiral out (as discussed for the symmetric case) if the system starts
from a random initial state. Evidently, due to the absence of cyclic symmetry the three absorbing states are
reached with unequal probabilities.

Fig. 53. Concentric orbits on the ternary phase diagram if the invasion rates are $w_{12} = 2$ and $w_{23} = w_{31} = 1$. The strategy
concentrations in the central stationary state (indicated by the plus symbol) are shifted towards the predator of the seeming
beneficiary of the largest invasion rate.

Simulations have confirmed that most of the relevant features are hardly affected by the loss of cyclic
symmetry on spatial structures either. The appearance of self-organizing patterns (sometimes with rotating
spiral arms) was reported for many three-state models. Examples are the forest-fire models (Bak et al.,
1990; Drossel and Schwabl, 1992) and ecological models (Durrett and Levin, 1997) mentioned already. Several
aspects of phase transitions, spatial correlations, fluctuations, and finite size effects were studied by
Antal and Droz (2001) and Mobilia et al. (2006a,b), who considered lattice Lotka-Volterra models.

For the SIRS models the transitions $I \to R$ and $R \to S$ are not affected by the neighbors, and this may be
the reason why the mean-field and pair approximations predict damped oscillations on hypercubic lattices
for a wide range of parameters (Joo and Lebowitz, 2004). At the same time, emerging global oscillations
were reported by Kuperman and Abramson (2001) on small-world structures when they tuned the ratio of
rewired links. For the latter system the amplitude of global oscillations never reached the saturation value
(excluding finite size effects), i.e., a transition from the limit cycle to one of the absorbing states was not
observed. Further analysis is needed to clarify the possible conditions (if any) for two subsequent phase
transitions in these systems.

The unusual response to invasion rate asymmetries can also be observed in three-strategy evolutionary
Prisoner's Dilemma games (discussed in Sec. 6.10) where cyclic dominance occurs. For example, if the
original strategy adoption mechanism among defectors (D), cooperators (C), and "Tit-for-Tat" strategists
(T) (recall that D beats C beats T beats D) is superimposed externally by an additional probability of
replacing randomly chosen T strategists by C strategists, then it is D who eventually benefits (Szabo et al.,
2000). In voluntary Public Good (Hauert et al., 2002; Szabo and Hauert, 2002b) or Prisoner's Dilemma
(Szabo and Hauert, 2002a) games the defector, cooperator, and loner strategies cyclically dominate each
other. For all these models, the increase of the defector's income [i.e., the "temptation to defect" parameter
$b$ in Eq. (130)] yields a decrease in the concentration of defectors, whereas the concentration of the
corresponding predator ("Tit-for-Tat" or loner) increases, as was shown in Fig. 41.



7.8. Cyclic dominance for Q > 3 states 



The three-state Rock-Scissors-Paper game can be generalized straightforwardly to $Q > 3$ states. For instance,
cyclic dominance for four strategies was discussed by Nowak and Sigmund (2004). They considered
a game where unconditional defectors (AllD) get replaced by the more successful Tit-for-Tat players, who
are transformed into Generous Tit-for-Tat strategists due to environmental noise. They, in turn, get conquered
by unconditional cooperators (AllC) not incurring the cost of inspection, but who are finally invaded
by AllD strategists again. Traulsen and Schuster (2003) faced a conceptually similar case when they studied a
simplified four-strategy version of the tag-based cooperation model studied earlier by Riolo et al. (2001) and
Nowak and Sigmund (1998).

In the general case $Q$ different states (types, strategies, species) are allowed for each site ($s_i = 1, 2, \ldots, Q$),
and these states dominate each other cyclically (i.e., 1 beats 2, 2 beats 3, and finally $Q$ beats 1). In the
simplest case predator invasion occurs on two (randomly chosen) neighboring sites if these are occupied by
a predator-prey pair, otherwise nothing happens. The interface is blocked between neutral pairs [e.g., (1,3)
or (2,4)]. Evidently, the possible types of frozen (blocked) interfaces increase with $Q$.
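
The elementary rule described above can be written down in a few lines. The sketch below uses a one-dimensional ring for brevity; the function names and the list representation of the lattice are assumptions made for this illustration.

```python
import random

def beats(a, b, Q):
    """True if state a is the predator of state b (1 beats 2, ..., Q beats 1)."""
    return b == a % Q + 1

def invasion_step(states, Q, rng):
    """Pick a random nearest-neighbor pair on a ring; the predator, if any, invades."""
    n = len(states)
    i = rng.randrange(n)
    j = (i + 1) % n
    if beats(states[i], states[j], Q):
        states[j] = states[i]
    elif beats(states[j], states[i], Q):
        states[i] = states[j]
    # neutral or identical pairs: nothing happens

def is_frozen(states, Q):
    """True if no neighboring pair on the ring is a predator-prey pair."""
    n = len(states)
    return not any(beats(states[i], states[(i + 1) % n], Q) or
                   beats(states[(i + 1) % n], states[i], Q) for i in range(n))
```

For example, `is_frozen([1, 3, 1, 3], 4)` is true because states 1 and 3 form a neutral pair for four species, whereas any configuration containing adjacent states 1 and 2 can still evolve.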



Considering this model on the $d$-dimensional hyper-cubic lattice Frachebourg and Krapivsky (1998) showed
that the spatial distribution evolves toward a frozen domain pattern if $Q$ exceeds a $d$-dependent threshold
value. In the one-dimensional case growing domains can be found if $Q < 5$, and a finite system develops into
one of the homogeneous states (Bramson and Griffeath, 1989; Fisch, 1990). During these domain growing
processes (for $Q = 3$ and 4) the formation of super-domains was studied by Frachebourg et al. (1996a).
They found that the concentrations of the moving and standing interfaces vanish algebraically with different
exponents.

On the one-dimensional lattice for $Q \ge 5$, the invasion front moves until it collides with another interface.
Evidently, if two oppositely moving interfaces meet, they annihilate each other. At the same time, the
collision of a standing and a moving interface creates a third type of interface, which is either an oppositely
moving or a standing one, as is shown in Fig. 54. Finally, all moving interfaces vanish and domain evolution
halts.
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Fig. 54. Time evolution of domain structure for five-species cyclic dominance in the one-dimensional system. 

Similar fixation occurs for higher spatial dimension $d$ if $Q > Q_{th}(d)$. Using the pair approximation
Frachebourg and Krapivsky (1998) found that $Q_{th}(2) = 14$ on the square lattice, and $Q_{th}(3) = 23$ on the
simple cubic lattice. In these calculations the numerical solution of the corresponding equations of motion
was restricted to random uncorrelated initial distributions, and the following symmetries were assumed:

$$p_1(1) = p_1(2) = \cdots = p_1(Q) = \frac{1}{Q} , \tag{150}$$

$$p_2(s_1, s_2) = p_2(s_2, s_1) , \tag{151}$$

and

$$p_2(s_1, s_2) = p_2(s_1 + a, s_2 + a) , \tag{152}$$

where the state variables (e.g., $s_1 + a$; $1 \le a < Q$) are taken modulo $Q$. The above predictions of the
pair approximation for the threshold values of $Q$ were confirmed by extensive Monte Carlo simulations
(Frachebourg and Krapivsky, 1998).






For $Q = 4$ the mean-field approximation leads to the following equations of motion (mean-field equations):

$$\begin{aligned}
\dot{p}_1(1) &= p_1(1)\,p_1(2) - p_1(1)\,p_1(4) , \\
\dot{p}_1(2) &= p_1(2)\,p_1(3) - p_1(2)\,p_1(1) , \\
\dot{p}_1(3) &= p_1(3)\,p_1(4) - p_1(3)\,p_1(2) , \\
\dot{p}_1(4) &= p_1(4)\,p_1(1) - p_1(4)\,p_1(3) .
\end{aligned} \tag{153}$$

These differential equations conserve the quantities

$$\sum_{s=1}^{Q} p_1(s) = 1 \tag{154}$$

and

$$\prod_{s=1}^{Q} p_1(s) = m . \tag{155}$$

One can also derive two other (not independent) constants of motion,

$$p_1(1)\,p_1(3) = m_{13} , \qquad p_1(2)\,p_1(4) = m_{24} , \tag{156}$$

where $m = m_{13} m_{24}$. These constraints assure that the system evolves along closed orbits, i.e., the concentrations
return periodically to the initial state.
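
The two additional constants of motion can be verified directly from the four-species equations of motion. The short derivation below is a sketch that assumes strictly positive concentrations, so that the logarithms are defined.

```latex
\begin{align*}
\frac{d}{dt} \ln[p_1(1)\,p_1(3)]
  &= [p_1(2) - p_1(4)] + [p_1(4) - p_1(2)] = 0 , \\
\frac{d}{dt} \ln[p_1(2)\,p_1(4)]
  &= [p_1(3) - p_1(1)] + [p_1(1) - p_1(3)] = 0 .
\end{align*}
```

The gain term of each odd (even) label species is exactly the loss term of the other odd (even) label species, which is why the cancellation relies on the number of species being even.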

Notice that the equations of motion (153) are satisfied by the usual trivial stationary solutions: the
central solution $p_1(1) = p_1(2) = p_1(3) = p_1(4) = 1/4$, and the four homogeneous absorbing solutions,
e.g., $p_1(1) = 1$ and $p_1(2) = p_1(3) = p_1(4) = 0$. Besides these, Eqs. (153) have three continuous sets of
stationary solutions, which can be parameterized by a single continuous parameter $\rho$. The first two with
$0 < \rho < 1$ are trivial solutions,

$$p_1(1) = \rho , \quad p_1(2) = 0 , \quad p_1(3) = 1 - \rho , \quad p_1(4) = 0 , \tag{157}$$

and

$$p_1(1) = 0 , \quad p_1(2) = \rho , \quad p_1(3) = 0 , \quad p_1(4) = 1 - \rho . \tag{158}$$

The third with $0 < \rho < 1/2$ is non-trivial (Sato et al., 2002):

$$p_1(1) = p_1(3) = \rho , \qquad p_1(2) = p_1(4) = \frac{1}{2} - \rho . \tag{159}$$



The generalization of the mean-field equations (153) is straightforward for Q > 4. In these cases the sum 
and the product of the strategy concentrations are trivial constants of motion. Evidently, the corresponding 
homogeneous and central stationary solutions exist, as well as the mixed states involving neutral species in 
arbitrary concentrations [see Eqs. (157) and (158)]. For even Q, the suitable version of (159) also remains 
valid. The difference between odd and even Q shows up in the generalization of the constraints (156). This 
generalization is only possible for Q even, and the corresponding formulae express that the product of 
concentrations for even (or odd) label species remains constant during evolution. 

In fact, all the above stationary solutions can be realized in lattice models, independently of the spatial 
dimension. The stability of these phases and their spatial competition will be studied later in Sec. 8. 

Many aspects of cyclic Q-species predator-prey systems have not been investigated yet. As for the
Rock-Scissors-Paper game, it would be important to understand how the topological features of the background
graph can affect the emergence of global oscillations and the phenomenon of fixation in these more general
systems.

The emergence of rotating spirals in a four-strategy evolutionary Prisoner's Dilemma game was studied
by Traulsen and Claussen (2004) using both synchronous and asynchronous updates on the square lattice.
The four-color domain structure can be characterized by the distribution of (three- and four-edge) vortices.
In this case, however, we have to distinguish a large number of different vortex types, unlike in the three-state
system, where there are only two types, the vortex and the anti-vortex. For the classification of vortices
Traulsen and Claussen (2004) used the concept of a topological charge. The spatial distribution of the
vortices indicated that generally two cyclically dominated four-edge vortices split into three-edge vortices
[with an edge separating the odd (even) label states]. Similar splitting of four-edge vortices was observed by
Lin et al. when they studied the four-phase pattern of a reaction-diffusion system driven externally
by a periodic perturbation.

The lack of additional constants of motion for an odd number of species implies a parity effect that becomes
more striking when we consider the effect of different invasion rates. Sato et al. (2002) studied what happens
if the invasion rate from species 1 to 2 is varied [$w_{12} = \alpha$ in the notation of Eq. (146)], while the others are
chosen to be unity [$w_{23} = w_{34} = \cdots = w_{Q1} = 1$ for $Q > 3$]. Monte Carlo simulations were performed on a
square lattice for $Q = 3$, 4, 5, and 6. For $Q = 5$ the results were similar to those described above for $Q = 3$,
i.e., the modification of $\alpha$ is beneficial for the predator of the seemingly favored species. In other words,
although invasion rates with $\alpha > 1$ seem to provide faster spreading for species 1, its ultimate concentration
in the stationary state is the lowest among the species, whereas its predator (species 5) is present with the
highest concentration. At the same time the concentration of species 3 is also enhanced, and this variation is
accompanied by a reduction in concentration for the remaining species. The magnitude of the concentration
variations is approximately proportional to $\alpha - 1$, implying reversed tendencies for $\alpha < 1$. The prediction of
the mean-field approximation for the central stationary solution,



$$p_1(1) = p_1(2) = p_1(4) = \frac{1}{3 + 2\alpha} , \qquad p_1(3) = p_1(5) = \frac{\alpha}{3 + 2\alpha} , \tag{160}$$



agrees qualitatively well with the Monte Carlo data as demonstrated in Fig. 55. The more general solution
for different invasion rates is given in the paper of Sato et al. (2002).

Fig. 55. Concentration of states as a function of $\alpha$ in the five-state model introduced by Sato et al. (2002). The Monte Carlo
data are denoted by circles, pluses, squares, Xs, and diamonds for the states from 1 to 5. The solid lines show the prediction
of the mean-field theory given by Eq. (160).

A drastically different behavior is predicted by the mean-field approximation for even $Q$ (Sato et al.,
2002). When $\alpha > 1$, the even label species die out after some transient phenomenon and finally the odd
label species form a frozen mixed state. Conversely, if $\alpha < 1$ all odd label species become extinct. Thus,
according to the mean-field approximation, the "central" solution $p_1(1) = \cdots = p_1(Q) = 1/Q$ only exists
if $\alpha = 1$ for even $Q$. Early Monte Carlo simulations, performed for the discrete values $\alpha = 1 \pm 0.2k$
($k = 0, 1, 2, 3$), seemed to confirm this prediction (Sato et al., 2002). However, repeating the simulations on
a finer grid around $\alpha = 1$, we can observe the coexistence of all species for $Q = 4$ as is shown in Fig. 56. The
stationary concentrations of species 1 and 3 increase monotonically above $\alpha_1 \approx 0.86$, until the even label species
die out simultaneously around the second threshold value $\alpha_2 \approx 1.18$. According to our preliminary Monte
Carlo results both extinction processes exhibit a similar power-law behavior, $p_1(1) \simeq p_1(3) \sim (\alpha - \alpha_1)^{\beta}$ and
$p_1(2) \simeq p_1(4) \sim (\alpha_2 - \alpha)^{\beta}$, where $\beta \approx 0.55(5)$ in the close vicinity of the thresholds. A rigorous investigation
of the universal features of these transitions is in progress.
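
The mean-field parity effect discussed above can be illustrated with a short numerical sketch for a cyclic chain in which only the rate from species 1 to 2 differs from unity. The integrator, the time step, the value of the modified rate, and the initial state are illustrative assumptions; the code checks that Eq. (160) is stationary for five species and follows the even label species for four species.

```python
def rhs(p, alpha):
    """Mean-field equations for Q cyclic species with w_12 = alpha and all other rates 1."""
    Q = len(p)
    w = [alpha] + [1.0] * (Q - 1)   # w[s]: rate at which species s+1 invades its prey
    return [p[s] * (w[s] * p[(s + 1) % Q] - w[s - 1] * p[s - 1]) for s in range(Q)]

def integrate(p, alpha, dt=0.01, steps=20000):
    """Classical fourth-order Runge-Kutta integration."""
    for _ in range(steps):
        k1 = rhs(p, alpha)
        k2 = rhs([x + dt / 2 * k for x, k in zip(p, k1)], alpha)
        k3 = rhs([x + dt / 2 * k for x, k in zip(p, k2)], alpha)
        k4 = rhs([x + dt * k for x, k in zip(p, k3)], alpha)
        p = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p

def central_solution_q5(alpha):
    """Eq. (160) for Q = 5."""
    norm = 3 + 2 * alpha
    return [1 / norm, 1 / norm, alpha / norm, 1 / norm, alpha / norm]

alpha = 1.2
residual = max(abs(x) for x in rhs(central_solution_q5(alpha), alpha))

start = [0.3, 0.2, 0.3, 0.2]   # illustrative Q = 4 initial state
end = integrate(start, alpha)
even_product_start = start[1] * start[3]
even_product_end = end[1] * end[3]
```

For four species the logarithmic derivative of the even-label product equals $(1 - \alpha)\,p_1(1)$, so for a modified rate above unity `even_product_end` is smaller than `even_product_start`, in line with the mean-field extinction of the even label species; the lattice simulations quoted above show that this picture is modified close to unity.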




Fig. 56. Concentration of states vs. $\alpha$ characterizing the modified invasion rate from state 1 to 2 in the four-state cyclic
predator-prey model. The Monte Carlo data are denoted by circles, pluses, triangles, and crosses for the states labeled from 1
to 4.

For $Q = 6$ simulations indicate a very similar picture to that in Fig. 56. A remarkable difference is due
to the fact that small islands of odd label species can remain frozen in large domains of even label species
(e.g., species 1 survives forever within domains of species 4, and vice versa), therefore the values of $p_1(1)$,
$p_1(3)$, and $p_1(5)$ remain finite below the first threshold value.

A similar parity effect was described by Kobayashi and Tainaka (1997) who considered a lattice version of
the Lotka-Volterra model with a linear food web of $Q$ species. In this predator-prey model species 1 invades
2, 2 invades 3, and $(Q - 1)$ invades $Q$ with all rates chosen to be 1 except for the last one parameterized
by $r$. On the square lattice invasions between randomly chosen neighboring sites govern evolution. It is
assumed, furthermore, that the top predator (species 1) dies randomly with a certain probability and its
body is directly transformed into species $Q$. This model was investigated by using Monte Carlo simulations
and mean-field approximations. Although the spatial effects are capable of stabilizing the coexistence of
species through the formation of a self-organizing pattern, Kobayashi and Tainaka (1997) have discovered
fundamental differences depending on the parity of the number of species in these ecosystems.

The most relevant feature of the observed parity law can be interpreted as an interference phenomenon
between direct and indirect effects (Kobayashi and Tainaka, 1997). Assume that the concentration of one of
the species is increased or decreased directly. This variation results in an opposite effect in the concentration
of the corresponding prey and this indirect effect propagates through the cyclic food web. For even $Q$ the
indirect effect will strengthen the direct modification, and the corresponding positive feedback can even
eliminate the minority species. In contrast with this, for odd $Q$ the ecosystem exhibits an unusual response
similar to the one described for the Rock-Scissors-Paper game with different invasion rates.
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8. Competing Associations 



We know that in classical game theory the number of Nash equilibria can be larger than one. In general,
the number of Nash equilibria (and ESSs in evolutionary games) rapidly increases with the number of pure
strategies $Q$ for an arbitrary payoff matrix (Broom, 2000). The situation becomes more complicated in
spatial evolutionary games, where each player follows one of the pure strategies, and interacts with a limited
number of co-players. Particularly for large $Q$, the players can only experience a limited portion of all the
possible constellations, and this intrinsic constraint affects the evolutionary process. Monte Carlo simulations
demonstrated that the dynamical rule (as a way of optimization) can lead to many different stationary states
in small systems. For large spatial models, these phases occur locally in the initial transient regime, and the
neighboring phases compete with each other along the boundary separating them. The long-run state of the
system develops as a result of random and/or deterministic invasion processes, and may have a very complex
spatio-temporal structure.

The large number of possible stationary states in these models was already seen in the mean-field
description. In many of these solutions several strategies are missing, and these states can be considered as
solutions to sub-games, in which the missing strategies are excluded by default. Henceforth, the possible
stationary solutions will be considered as "associations of species".

For ecological systems the relevance of complex spatio-temporal structures (Watt, 1947) and associations
of species has been studied for a long time [for a survey and further references see Johnson and Boerlijst
(2002)]. In the subsequent sections we will discuss several simple models to demonstrate the surprisingly rich
variety of behavior that can occur for predator-prey interactions. In these systems the survival of a species
is strongly related to the existence of a suitable association that can provide a higher stability compared to
other possibilities.

8.1. A four-species cyclic predator-prey model with local mixing 

Let us first study the effect of local mixing in the four-species cyclic predator-prey model discussed in
Section 7.8. In this model each site $x$ of a square lattice can be occupied by an individual belonging to one
of four species ($s_x = 1, \ldots, 4$). The cyclic predator-prey relation is defined as above, i.e., 1 invades 2 invades
3 invades 4 invades 1. The evolutionary process repeats the following steps:

1) Choose two neighboring sites at random; 

2) If these sites are occupied by a predator-prey pair, the offspring of the predator occupies the prey's site
[e.g., $(1,2) \to (1,1)$];

3) If the sites are occupied by a neutral pair of species, they exchange their sites with probability $\mu < 1$
[e.g., $(1,3) \to (3,1)$];

4) Nothing happens if the states are identical. 

Starting from a random initial state these steps are repeated many times, and after a suitable
thermalization time we determine the one- and two-site configuration probabilities, $p_1(s)$ and $p_2(s, s')$
with $s, s' = 1, \ldots, 4$, respectively, by averaging over a sufficiently long sampling time.
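
The four steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code used for the results quoted in this section: the function name `simulate`, the lattice size, the sweep counts, and the convention that one sweep consists of L x L elementary steps are all choices made here. Species are stored as 0..3 in place of 1..4, and the pair probabilities are symmetrized over the two orientations of each bond.

```python
import random

def simulate(L=32, mu=0.05, therm_sweeps=200, sample_sweeps=100, seed=1):
    """Four-species cyclic predator-prey model with site exchange on an L x L torus.

    Returns (p1, p2): one-site and (symmetrized) nearest-neighbor two-site
    probabilities, averaged over the sampling sweeps. Index a stands for species a+1.
    """
    rng = random.Random(seed)
    Q = 4
    s = [[rng.randrange(Q) for _ in range(L)] for _ in range(L)]  # random initial state
    p1 = [0.0] * Q
    p2 = [[0.0] * Q for _ in range(Q)]
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    for sweep in range(therm_sweeps + sample_sweeps):
        for _ in range(L * L):
            # 1) choose two neighboring sites at random (periodic boundaries)
            x, y = rng.randrange(L), rng.randrange(L)
            dx, dy = rng.choice(steps)
            u, v = (x + dx) % L, (y + dy) % L
            a, b = s[x][y], s[u][v]
            if (a + 1) % Q == b:        # 2) a is the predator of b
                s[u][v] = a
            elif (b + 1) % Q == a:      # 2) b is the predator of a
                s[x][y] = b
            elif a != b and rng.random() < mu:  # 3) neutral pair: exchange
                s[x][y], s[u][v] = b, a
            # 4) identical states: nothing happens
        if sweep >= therm_sweeps:       # sample after thermalization
            for x in range(L):
                for y in range(L):
                    a = s[x][y]
                    p1[a] += 1.0
                    # count each bond once (right and down neighbor), in both orders
                    for b in (s[(x + 1) % L][y], s[x][(y + 1) % L]):
                        p2[a][b] += 0.5
                        p2[b][a] += 0.5
    n_sites = L * L * sample_sweeps
    p1 = [c / n_sites for c in p1]
    p2 = [[c / (2 * n_sites) for c in row] for row in p2]
    return p1, p2
```

With this normalization the sixteen entries of `p2` sum to one and each row sum reproduces the corresponding entry of `p1`. Small lattices such as the default are only meant for quick experiments; quantitative estimates would require much larger sizes and sampling times.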

The value of $\mu$ characterizes the strength of local mixing. If $\mu = 0$, the model is equivalent to the one
discussed in Section 7.8, and a typical snapshot is shown in the upper right part of Fig. 57. As described
above, this self-organizing pattern is sustained by traveling invasion fronts. A similar spatio-temporal pattern
can be observed below a threshold value of mixing, $\mu < \mu_{\rm cr} = 0.02662(2)$ (Szabó, 2005). As $\mu$ approaches the
threshold value, the irregularity of the interfaces separating neutral species increases, and one can observe
the formation of small islands occupied by only two (neutral) species within the well-mixed state.

For $\mu < \mu_{\rm cr}$ all four species survive with the same concentration, $p_1(s) = 1/4$, while the probability

$$p_{\rm np} = p_2(1,3) + p_2(3,1) + p_2(2,4) + p_2(4,2) \tag{161}$$

of finding neutral pairs (np) on neighboring sites decreases as $\mu$ increases. At the same time, the Monte
Carlo simulations indicate a decrease in the frequency of invasions. This quantity is nicely characterized by
the probability of predator-prey pairs (ppp), defined as

$$p_{\rm ppp} = p_2(1,2) + p_2(2,3) + p_2(3,4) + p_2(4,1) + p_2(2,1) + p_2(3,2) + p_2(4,3) + p_2(1,4). \tag{162}$$

The Monte Carlo data indicate a first-order phase transition at $\mu = \mu_{\rm cr}$.

Fig. 57. The large snapshot (L = 160) shows the formation of the associations of neutral pairs during the domain growth
process for $\mu = 0.05$. Cyclic dominance in the food web is indicated in the lower right plot. The smaller snapshot (upper right)
illustrates a typical distribution of species in the absence of site exchange, $\mu = 0$.

Above the threshold value of $\mu$, simulations display a domain growth process, which is analogous to the
one described by the kinetic Ising model below its critical temperature (Glauber, 1963). As illustrated in
Fig. 57, there are two types of growing domains, formed from odd and even label species. These domains
are separated by a boundary layer where the four species cyclically invade each other. This boundary layer
serves as a "species reservoir", sustaining the symmetric composition within domains. In agreement with
theoretical expectations (Bray, 1994), the typical domain size increases with time as $l(t) \sim \sqrt{t}$. Thus, for
any finite size this system develops into a mono-domain state, consisting of two neutral species with equal
concentrations. Since there are no invasion events within this state, $p_{\rm ppp}$ can be considered as an order
parameter that becomes zero in the stationary state for $\mu > \mu_{\rm cr}$. Due to the site exchange mechanism the
distribution of the two neutral species evolves into an uncorrelated (symmetric) state [e.g., $p_1(1) = p_1(3) =
1/2$ and $p_2(1,1) = p_2(3,3) = p_2(1,3) = p_2(3,1) = 1/4$].
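
As a quick numerical check of the pair quantities defined in Eqs. (161) and (162), the following sketch evaluates them for two idealized states. The helper name `pair_observables` and the 0-based indexing (index a stands for species a + 1) are choices made here; the uncorrelated four-species state is included only for comparison and is an assumption of this example, not a simulation result.

```python
def pair_observables(p2):
    """Return (p_np, p_ppp) of Eqs. (161)-(162) from a 4 x 4 table p2[a][b],
    the probability of finding the ordered pair (a+1, b+1) on neighboring sites."""
    Q = 4
    # neutral pairs: (1,3), (2,4), (3,1), (4,2)
    p_np = sum(p2[a][(a + 2) % Q] for a in range(Q))
    # predator-prey pairs in both orientations: (1,2), (2,1), (2,3), (3,2), ...
    p_ppp = sum(p2[a][(a + 1) % Q] + p2[(a + 1) % Q][a] for a in range(Q))
    return p_np, p_ppp

# Uncorrelated mono-domain state of species 1 and 3 (all four pair types have weight 1/4)
mono = [[0.0] * 4 for _ in range(4)]
for a in (0, 2):
    for b in (0, 2):
        mono[a][b] = 0.25

# Uncorrelated four-species state with equal concentrations (assumed for comparison)
uniform = [[1.0 / 16.0] * 4 for _ in range(4)]
```

For the mono-domain state the function gives $p_{\rm np} = 1/2$ and $p_{\rm ppp} = 0$, consistent with the role of $p_{\rm ppp}$ as an order parameter; for the uncorrelated four-species state it gives $p_{\rm np} = 1/4$ and $p_{\rm ppp} = 1/2$.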

The traditional mean-field approximation cannot account for site exchange, therefore the corresponding 
equations of motion are the same as those in Eqs. (153) of Sec. 7.8. The two- and four-site approximations 
also fail, despite the fact that they directly involve the effect of site exchange. A qualitatively correct 
prediction can be achieved when the generalized mean-field approximation is performed on a 3 x 3 cluster. 
This discrepancy may indicate the strong relevance of complex processes (represented by several consecutive 
elementary steps in time and/or strongly correlated local structures), whose correct description would require 
larger cluster sizes beyond those analyzed thus far. 

Let us recall that this system has four homogeneous stationary states, two continuous sets of two-species
states, and one symmetric four-species stationary solution. Despite the large number of possible solutions,
simulations only realize the symmetric four-species solution for $\mu < \mu_{\rm cr}$, excepting situations when the
system is started from some strongly correlated (almost homogeneous) initial states. The preference for
two-species states for $\mu > \mu_{\rm cr}$ is due to a particular feature of the food web. Namely, participants of these
spatial associations mutually protect each other against all possible external invaders. For example, if an
individual of species 1, who belongs to the well-mixed association of species 1 and 3, is invaded by an external
invader of species 4, then one of its neighbors of species 3 will recapture the lost site within a short time.
The individuals of species 1 will guard their partners of species 3 in the same way against attacks by species
2. For this reason, such states will be called "defensive alliances" in the sequel.

The present model has two equivalent defensive alliances, because members of the well-mixed association
of species 2 and 4 can also protect each other against their respective predators using the above mechanism.
Such defensive alliances are preferred to all other states when the mixing $\mu$ exceeds a threshold value.

The spatial competition of different associations can be visualized directly by Monte Carlo simulations, if
the initial state is artificially set to contain large domains of two different associations. For example, one can
study a strip-like initial structure, where parallel domains are occupied alternately by four-species (cyclic)
and two-species (e.g., 1+3) associations. The linear variation of $p_{\rm ppp}$ with time measures the speed of the
invasion process along the boundary, and we can determine the average velocity of the invasion front.
For $\mu < \mu_{\rm cr}$ the four-species association expands, but its average invasion velocity vanishes linearly as the
critical mixing rate is approached (Szabó and Sznaider, 2004):

$$v_{\rm inv} \sim (\mu_{\rm cr} - \mu). \tag{163}$$

For $\mu > \mu_{\rm cr}$ the application of this approach is less reliable. It seems to be strongly limited to those regions
where the spontaneous nucleation rate is very slow. The linear scaling of the average invasion velocity is
accompanied by a divergence in the relaxation time, and we need very long run times in the
vicinity of the threshold. On the other hand, Eq. (163) can be used to improve the accuracy of $\mu_{\rm cr}$ by
deducing $v_{\rm inv}$ from short time runs further away from $\mu_{\rm cr}$. This approach becomes particularly efficient for
situations when the stationary state of the competing associations is not affected by the parameters (e.g.,
the mixed state is independent of $\mu$).
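
The extrapolation suggested by Eq. (163) amounts to fitting a straight line to the measured invasion velocities and locating its zero. The sketch below illustrates this with an ordinary least-squares fit; the function name `estimate_mu_cr` is hypothetical, and the data points are synthetic (generated from an exactly linear law with an arbitrary prefactor and an arbitrary threshold of 0.03), not Monte Carlo results from the literature.

```python
def estimate_mu_cr(mus, velocities):
    """Least-squares line v = slope * mu + intercept; returns the zero of the line,
    i.e. the value of mu at which the fitted invasion velocity vanishes."""
    n = len(mus)
    mean_mu = sum(mus) / n
    mean_v = sum(velocities) / n
    sxx = sum((m - mean_mu) ** 2 for m in mus)
    sxv = sum((m - mean_mu) * (v - mean_v) for m, v in zip(mus, velocities))
    slope = sxv / sxx
    intercept = mean_v - slope * mean_mu
    return -intercept / slope

# Synthetic illustration: v_inv = c * (mu_cr - mu) with assumed c = 2.0, mu_cr = 0.03
true_mu_cr = 0.03
mus = [0.005, 0.010, 0.015, 0.020]
velocities = [2.0 * (true_mu_cr - m) for m in mus]
mu_cr_fit = estimate_mu_cr(mus, velocities)
```

For noiseless linear data the fit recovers the assumed threshold up to rounding. With real simulation data the points closest to the threshold carry the largest statistical errors, so a weighted fit would be a natural refinement.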

The emergence of defensive alliances seems to be very robust, and is not related to the simplicity of the
present model. Similar phenomena were found for another model, where local mixing was built in in the form
of random (nearest-neighbor) jumps to vacant sites (Szabó and Sznaider, 2004). Using similar dynamical
rules Irle et al. (2005) reported the formation of two defensive alliances (consisting of species with odd or even
labels) in a cyclic six-species spatial predator-prey model. In a more realistic computer simulation Sznaider
(2003) observed the spatial formation of defensive alliances on a continuous plane, where four species (with
cyclic dominance) moved randomly, and produced offspring after they had caught and consumed a prey
within a given distance.

8.2. Defensive alliances 

In the previous section we discussed a spatial four-species cyclic predator-prey model, where two equivalent
defensive alliances can exist. Their stability is supported by local mixing, i.e., site exchange between neutral
pairs. The robustness of the formation of such types of defensive alliances was also confirmed for the six-
and eight-species versions of the model. Evidently, larger Q gives rise to more stationary states that consist
of mutually neutral species. Nevertheless, Monte Carlo simulations indicate that only two types of states
(associations) can survive after an evolutionary competition. If the mixing probability $\mu$ is smaller than a
Q-dependent threshold $\mu_{\rm cr}(Q)$, all species can coexist by forming a self-organizing pattern characterized by
perpetual cyclic invasions. In the opposite case, $\mu > \mu_{\rm cr}(Q)$, one of the two equivalent defensive alliances
conquers the system at the end of a domain growth process. The two equivalent defensive alliances are
composed of species with only odd and even labels, respectively. In the well-mixed phase, above $\mu_{\rm cr}$, their
stability is provided by a mechanism similar to that for Q = 4. In fact these are the only associations which
can protect themselves against attacks from the rest of the species. The threshold value $\mu_{\rm cr}$ decreases for
increasing Q. According to the Monte Carlo simulations, $\mu_{\rm cr}(Q = 6) = 0.0065(1)$ and $\mu_{\rm cr}(Q = 8) =
0.0028(1)$. Numerical studies face difficulties for Q > 8, because the domain growth velocity decreases as Q
increases.
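
The statement that only the odd- and even-label associations can protect themselves can be illustrated by a brute-force enumeration on the cyclic food web. The sketch below uses a purely static criterion that is an assumption of this example: a set of species counts as a candidate alliance if its members are mutually neutral and every external species that preys on a member is itself the prey of some member. The function names are hypothetical, and the criterion ignores the dynamics, so it says nothing about the threshold $\mu_{\rm cr}(Q)$.

```python
from itertools import combinations

def invades(a, b, Q):
    """True if species a invades species b in the cyclic web 1 -> 2 -> ... -> Q -> 1."""
    return (a % Q) + 1 == b

def is_neutral_alliance(members, Q):
    """Static check: mutually neutral members that can retaliate against every
    external predator of any member (assumed criterion, see lead-in)."""
    members = set(members)
    # members must not prey on each other
    if any(invades(a, b, Q) for a in members for b in members):
        return False
    for ex in range(1, Q + 1):
        if ex in members:
            continue
        attacks = any(invades(ex, s, Q) for s in members)
        guarded = any(invades(g, ex, Q) for g in members)
        if attacks and not guarded:
            return False
    return True

def neutral_alliances(Q):
    """All proper subsets (at least two species) passing the static check."""
    species = range(1, Q + 1)
    return [S for k in range(2, Q)
            for S in combinations(species, k)
            if is_neutral_alliance(S, Q)]
```

For Q = 4, 6, and 8 the enumeration returns exactly the two sets of odd and even labels, e.g. `neutral_alliances(6)` yields `[(1, 3, 5), (2, 4, 6)]`. The reason is that the only external predator of species s is s - 1, whose only predator is s - 2, so a protected set must be closed under steps of two.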

The formation of defensive alliances seems to require local mixing. This is the reason why these states
become dominant if $\mu$ exceeds a suitable threshold value. Cyclic dominance, however, may inherently imply
another mechanism for the emergence of defensive alliances which is not necessarily based on local mixing
(Boerlijst and Hogeweg, 1991; Szabó and Czárán, 2001b,a).



In order to see this, notice that within a self-organizing spatial pattern each individual has a finite
lifetime, because it will be consumed by its predator sooner or later. Assume now that there is an association
composed of cyclically dominating species (to be called a cyclic defensive alliance). An external invader $s_{\rm ex}$,
outside the association, who is a predator of the internal species $s$, is eliminated from the system within
the lifetime if the internal predator of $s$ also consumes $s_{\rm ex}$. As an example, Fig. 58 shows a six-species
predator-prey system. The food web gives rise to two cyclic defensive alliances, whose spatial competition
for different mixing rates is also depicted.




Fig. 58. A six-species predator-prey model defined by the food web on the right. Members of the two possible cyclic defensive
alliances are connected by thick dotted lines. Left panels show snapshots of the domain growth process for two different rates
of mixing (left: $\mu = 0$, right: $\mu = 0.08$).

In this six-species model each species has two predators and two prey in a way defined by the food web
in Fig. 58 (Szabó, 2005). The odd (even) label species can build up a self-organizing structure controlled by
cyclic invasions. The members of these associations cyclically protect each other against external invaders as
mentioned above. In the absence of local mixing, the Monte Carlo simulations on the square lattice visualize
how domains of these defensive alliances grow (for an intermediate state see the upper snapshot in Fig. 58).
Eventually the system develops into one of the "mono-domain" (mono-alliance) states.

The spatio-temporal structure of these associations seems to play a decisive role in the protection
mechanism, because the corresponding mean-field approach cannot describe the extinction of external invaders.
Contrary to the simulation results, the numerical solution of the mean-field equations predicts periodic
oscillations. Very recently the relevance of the spatial structure was confirmed by Kim et al. (2005), who
demonstrated that the formation of these defensive alliances can be impeded on complex networks, where a
sufficiently large portion of the links is rewired.

Simulations on the square lattice show a strikingly different behavior when we allow site exchange between
neutral pairs (e.g., the black and white species in Fig. 58). This process favors the formation of two-species
domains composed of a neutral pair of species. Surprisingly, these mixed states can also be considered
as defensive alliances. Consequently, for a sufficiently large mixing rate $\mu$, one can observe the formation of
domains of all three two-species associations (see the right snapshot in Fig. 58).

According to the simulations a finite system evolves into one of the two cyclic three-species defensive
alliances if $\mu < \mu_{\rm cr} = 0.0559(1)$. Due to the absence of neutral species the final self-organizing
spatio-temporal structure is independent of $\mu$, and is equivalent to the one discussed in Section 7.1. In the opposite
case, $\mu > \mu_{\rm cr}$, the finite lattice evolves into an uncorrelated distribution of two neutral species, e.g., $p_1(1) =
p_1(4) = 1/2$ and $p_2(1,1) = p_2(4,4) = p_2(1,4) = p_2(4,1) = 1/4$ (Szabó and Sznaider, 2004).

In this predator-prey system, either below or above $\mu_{\rm cr}$, the domain growth process is controlled by the
random motion of boundaries separating equivalent associations. In the next section we will consider what
happens if the coexisting associations are different and/or cyclically dominate each other.



8.3. Cyclic dominance between associations in a six-species predator-prey model 



In order to demonstrate the rich variety of possible complex behaviors for associations, we now consider
in detail a six-species spatial predator-prey model embedding the features of the previous models (Szabó,
2005). The present system has six species, whose individuals are distributed on a square lattice in such a
way that each site is occupied by one of the species. Each species has two predators and two prey according
to the food web plotted in Fig. 59. The invasion rates between any predator-prey pairs are chosen to be
unity. Choosing sufficiently large system sizes with periodic boundary conditions, the system is started from
a random initial state. Evolution is controlled by sequential invasion events if two neighboring sites, chosen
at random, are occupied by a predator-prey pair. If the sites are occupied by a neutral pair, they will
exchange their positions with a probability $\mu$ characterizing the strength of local mixing. After a suitable
thermalization time the system evolves into a state that we consider.

Before reporting the results of the Monte Carlo simulations it will be illuminating to recall the possible
stationary states. This system has six homogeneous states which are unstable against invasions of the
corresponding predators. There exist three two-species states, consisting of neutral pairs of species like
1+4, 2+6, and 3+5, with fixed composition, whose spatial distribution is frozen (becomes uncorrelated)
for $\mu = 0$ ($\mu > 0$). Two subsystems, composed of species 2+3+4 or 1+5+6, are equivalent to the spatial
Rock-Scissors-Paper game. One of the four-species subsystems (1+3+4+5) is analogous to the one discussed in
Section 8.1. The existence of these states is confirmed by the mean-field approximation, which also allows
the coexistence of all six species with the same fixed concentration, $p_1(s) = 1/6$ for $s = 1, \ldots, 6$, or with
oscillating concentrations.




Fig. 59. Distribution of species in four typical stationary states obtained by simulations for values of $\mu$ given for each snapshot.
The color of the labeled species and the predator-prey relations are defined by the food web.

According to the simulations performed on small systems (L ~ 10), evolution ends in one of the one- or
two-species absorbing states, and selection is hardly affected by the presence of local mixing. Unfortunately,
the effect of system size on the transition from a random initial state into one of the absorbing (or even
meta-stable) states has not yet been studied rigorously. However, this type of investigation could be useful for
understanding the specialization (differentiation) of biological cells.

On sufficiently large systems the prevalence of the homogeneous state can never be observed. The ultimate
behavior is determined uniquely by the value of $\mu$ (for snapshots see Fig. 59). In the absence of local mixing
($\mu = 0$) two species (2 and 6, or black and white on the snapshot) die out within a short time and the
surviving species form a self-organizing pattern maintained by cyclic invasions. We have to emphasize that
this subsystem can be considered as a defensive alliance, whose stability is supported by its self-organizing
pattern in the present six-species model. For example, if an external intruder of species 2 attacks a member
of species 3 (or 5) then the offspring will be eliminated by their common predator 1 (or 4).

For small $\mu$ values local mixing is not able to prevent the extinction of species 2 and 6, and the behavior of
the resultant subsystem becomes equivalent to the cyclic four-species predator-prey model discussed above.
This means that the self-organizing structure is eroded by the local mixing and the well-mixed associations
of the neutral species (1+4 or 3+5) become favored above the first threshold value, $\mu_{c1} = 0.0265$, given in
Section 8.1. Thus, for the region $\mu_{c1} < \mu < \mu_{c2}$ the final state of the system consists of only two neutral
species, as shown by the snapshots in Fig. 59 for $\mu = 0.03$.

The velocity of the segregation process increases with $\mu$. As a result, for $\mu > \mu_{c2}$ small (well-mixed)
domains of species 1 and 4 (or 3 and 5) can form before the rest of the species (2 and 6) become extinct.
The occurrence of these domains helps the minority species to survive, because both species 1 and 4 (3 and 5)
are prey for species 6 (2). The largest snapshot, consisting of 400 x 400 lattice sites in Fig. 59, depicts the
successive events controlling the evolution of this self-organizing pattern. In this snapshot there are only a
few black and white sites (species 2 and 6), which invade (without hindrance) the territory occupied only by
their two prey. Behind these invasion fronts the occupied territory is re-invaded by their predators, whose
territory is again attacked by their respective predators, etc.

Although, as discussed above, the food web allows the formation of many other three- and four-species
"cyclic" patterns, the present dynamics prefers the four-species cyclic defensive alliance (1+3+4+5) to any other
cyclic association. On longer time scales, however, the strong mixing transforms these (cyclic) four-species
domains into two-species associations consisting of neutral species, which can be invaded again by one of
the minority species.

Notice that the above successive cyclic process is much more complicated than the simple invasion process
modeled by the spatial Rock-Scissors-Paper game or even by the forest-fire model. In the present system
two types of cycles, $(1+3+4+5) \to (1+4) \to 6 \to (1+2+3+4+5+6) \to (1+3+4+5)$ and
$(1+3+4+5) \to (3+5) \to 2 \to (1+2+3+4+5+6) \to (1+3+4+5)$, are entangled. The boundaries
between the territories of associations are fuzzy. Furthermore, transitions from one association into another
have different time scales and characteristics depending on $\mu$. As a result, the concentration of the six
species varies with $\mu$ if $\mu_{c2} < \mu < \mu_{c3} = 0.0581(1)$. Within this region the concentrations of species 2
and 6 increase monotonically with $\mu$ (as illustrated by the Monte Carlo data in Fig. 60), and above the third
threshold value they can form a mixed state prevailing in the whole system. In fact, the well-mixed distribution
of species 2 and 6 also satisfies the criteria of defensive alliances, because they mutually protect each other
against the rest of the species.

Fig. 60. Concentration of species vs. $\mu$. Monte Carlo data are denoted by diamonds, large circles, squares, crosses, pluses, and
small circles for the six species labeled from 1 to 6, respectively. Arrows indicate the three transition points.

Determination of the second transition point is made difficult by increasing fluctuations in the
concentrations when $\mu_{c2}$ is approached from above. Contrary to many critical phase transitions, where the fluctuation
of the vanishing order parameter diverges, here the largest concentration fluctuations are found for the
majority species. Visualization of the spatial evolution indicates that the invasions of minority species (2 and
6) give rise to fast variations with large fluctuations and increasing correlation lengths if $\mu \to \mu_{c2}^{+}$. Due to
the mentioned technical difficulties the characteristic features of this transition are not yet fully clarified.

It is interesting that in the given model the coexistence of all species is only possible within a limited
range of mixing, $\mu_{c2} < \mu < \mu_{c3}$. The steady-state concentrations are determined by the cyclic invasion rates
between the (multi-species) associations. As discussed in Section 7.8, this type of self-organizing pattern
can be maintained for various invasion rates and mechanisms, depending on the strength of mixing. This
phenomenon is not unique, as similar features were observed for another six-species model (Szabó, 2005).
These results suggest that the stability of very complex ecological and catalytic chemical systems can be
enhanced by cycles in the spatio-temporal pattern [for an early survey see the book by Eigen and Schuster
(1979)].

The above six-species states can be viewed as composite associations, which consist of simpler associations
cyclically dominating each other. Following this procedure one can construct a hierarchy of associations,
which is able to sustain biodiversity at a very high level. In the last sections the analysis has been restricted to
those systems where only predator-prey interactions and local mixing were permitted. We expect, however,
that suitable extensions of the evolutionary dynamics may induce an even larger variety of behaviors.

It is widely accepted that self-organization plays a crucial role in prebiotic evolution. Self-organizing
processes can provide a way for transforming nonliving matter into living organisms [for a brief survey of
recent research see Rasmussen et al. (2004)]. The systematic investigation of simplified evolutionary games
as toy models can serve as a bottom-up approach that helps us clarify theoretically the fundamental
phenomena, mechanisms, and environmental requirements needed for creating living systems.



9. Conclusions and outlook 



In this article we reviewed our current understanding of evolutionary games that have become of increased 
interest to biologists, economists, and social scientists in recent years. Following a pedagogical way we 
surveyed the most relevant elements of the classic theory of games, as well as the main components of 
evolutionary game theory. One of the principal goals was to demonstrate the applicability of concepts and 
approaches originally developed in statistical physics to study many-particle systems. For this purpose 
we investigated in detail some representative games like the Prisoner's Dilemma and the Rock-Scissors- 
Paper game for different connectivity structures and evolutionary rules. The analysis of these games nicely 
illustrated the possible richness of behavior. Evolutionary games can exhibit different stationary states and 
phase transitions when fundamental parameters like the payoffs, the level of noise, the dimensionality or the 
connectivity structure are varied. 

In comparison with physical systems, evolutionary games can show more complicated behaviors which 
depend sensitively on the applied dynamic rules and the topological features of the underlying social graph. 
Usually the behavior of a physical system can be reproduced qualitatively well by the traditional mean-field 
approximation. However, the appropriate description of an evolutionary game requires more sophisticated 
techniques, taking the fine details appearing both in the model and in the configuration space into account. 
Although extended versions of the mean-field technique are useful and are capable of providing correct results
on a high enough level, they should frequently be complemented by time-consuming numerical calculations.

Most of our discussion focused on infinitely large systems where the concept of phase transitions is
applicable. We have to emphasize, however, that for many realistic systems finite-size models and approaches
(e.g., Moran process, fixation) can give more appropriate descriptions. At the same time the corresponding
techniques are capable of deducing simple analytical relations [see Nowak (2006b)]. The clarification of
finite-size effects is especially needed for those systems which have two (or more) time and/or spatial scales.

Unfortunately, systematic investigations are strongly limited by the wide variety of microscopic rules and
the large number of parameters. For example, at the beginning many questions in social dilemmas were
only investigated within the framework of the Prisoner's Dilemma, although similar problems and questions
naturally arise for the Hawk-Dove and some other games too [see the papers by Nowak and May (1993);
Killingback and Doebeli (1996); Hauert and Doebeli (2004); Doebeli and Hauert (2005); Sysi-Aho et al. (2005);
Tomassini et al. (2006)]. The extension of research towards these models in recent years has started to
provide a more complete picture about the emergence of cooperation in social dilemmas in general (Santos et al.,
2006b; Hauert et al., 2001; Ohtsuki et al., 2006).

One of the most striking features of evolutionary games is the occurrence of self-organizing patterns. These
patterns can be characterized by additional (internal) parameters (e.g., average time period of local
oscillations, geometrical properties of the spatial patterns, etc.). We hope that future efforts will reveal the detailed
relationships between these internal properties. The appearance of self-organization is strongly related to the
absence of detailed balance in the microscopic states. This feature can be described by introducing a
"probability current" between microscopic states. The investigation of the distribution of probability currents (the
"vorticity structure") (Schnakenberg, 1976; Zia and Schmittmann, 2006) can provide further information
about the possible relations between microscopic mechanisms and the resulting spatial structures.

The present review is far from being complete. Missing related topics include minority games, games with
random payoffs, quantum games, and those systems where the payoffs come from multi-person (e.g. Public
Good) games. About minority games, as mentioned previously, the reader can find detailed discussions in
the books by Challet et al. (2004) and by Coolen (2005). Methods developed originally within spin glass
theory [for reviews see Mézard et al. (1987); Györgyi (2001)] are used successfully by Berg and Engel (1998) and
Berg and Weigt (1999), who studied the properties of two-player games with random payoffs in the limit
$Q \to \infty$.

Many aspects of evolutionary game theory are utilized directly in the development of computational
learning algorithms [for a survey see the paper by Wolpert et al. (2004) and further references therein]. In
these systems the players represent algorithms, and the Darwinian selection among the different mutants
is controlled by some utility function (payoff). Thus, this research area combines three fields: game theory,
machine learning, and optimization theory.

Another promising research topic is quantum game theory, which involves the quantum superposition of
pure strategies [for a brief overview we suggest Lee and Johnson (2001, 2002) and further references therein].
Utilizing quantum entanglement, this extension may offer strategies more favorable than classical ones (for
a nice example see the description of the Quantum Penny Flip in Appendix A.9). As quantum entanglement
enforces common interest, the Pareto optimum can be easily achieved by using quantum strategies even
for the Prisoner's Dilemma, as was discussed by Eisert et al. (1999). Players can avoid the dilemma if they
both follow quantum strategies. This two-person quantum game was experimentally realized by Du et al.
(2002) on a nuclear magnetic resonance quantum computer. It is believed that a deeper understanding of
quantum games can inspire the development of efficient evolutionary rules for achieving the optimal global
payoff in social dilemmas.

Appendix A. Games

A.1. Battle of the Sexes

This game is a typical (asymmetric) Coordination game. Imagine a wife and a husband who forget whether 
they have agreed to attend an opera production or a sports event in the evening. Both events are unique 
and they should decide simultaneously without communication where to go. The husband prefers the sports 
event, his wife the opera, but both prefer being together rather than alone. The payoff matrix is:

$$
\begin{array}{c|cc}
\text{Husband} \backslash \text{Wife} & \text{Opera} & \text{Sports} \\ \hline
\text{Opera} & (1,2) & (0,0) \\
\text{Sports} & (0,0) & (2,1)
\end{array}
\tag{A.1}
$$



Note that Battle of the Sexes is a completely different game in the biology literature (Hofbauer and Sigmund,
1998). That game is structurally similar to Matching Pennies.



A.2. Chicken game

Bertrand Russell, in his book "Common Sense and Nuclear Warfare" (Russell, 1959), describes the game
as follows: "... a sport which, I am told, is practised by some youthful degenerates. This sport is called
"Chicken!" It is played by choosing a long straight road with a white line down the middle and starting two 
very fast cars towards each other from opposite ends. Each car is expected to keep the wheels of one side 
on the white line. As they approach each other, mutual destruction becomes more and more imminent. If 
one of them swerves from the white line before the other, the other, as he passes, shouts "Chicken!" and the 
one who has swerved becomes an object of contempt...." 

Chicken is mathematically equivalent to the Hawk-Dove game and to the Snowdrift game. 



A.3. Coordination game

Whenever players should choose an identical action, whatever it is, to receive a high payoff, we speak about
a Coordination game. As a typical example think about competing technologies like video standards (VHS vs
Betamax) or storage formats (Blu-ray vs HD DVD). When firms are capable of agreeing on the technology
to apply, market sales and thus profits are high. However, in the absence of a common technology standard the
market is full of compatibility problems, and buyers are reluctant to purchase. Sales and profits are low for
all producers. Ironically, the actual technology chosen has less importance, and it may well happen that the
market coordinates on an inferior product, as was undoubtedly the case for the QWERTY keyboard system.

The Coordination game is also the adequate mathematical metaphor behind social conventions such as 
our collective choice of the side of the road we drive on, the time zones we define, or the signs we associate 
with given meanings in our linguistic communication. 

In general, the players can have Q > 2 options, and the Nash equilibria can be achieved by agreeing 
on which one to use. For different payoffs (see Battle of the Sexes) the players can prefer different Nash 
equilibria. 
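
The pure Nash equilibria of such small bi-matrix games can be enumerated mechanically. The following Python
sketch is only an illustration: the function name `pure_nash_equilibria` and the nested-list encoding of the
payoffs are our own assumptions, not notation used elsewhere in this review. It checks every action pair for
profitable unilateral deviations and is applied to the payoffs of Eq. (A.1).

```python
from itertools import product

def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player bimatrix game.

    payoffs[i][j] is the pair (row player's payoff, column player's payoff)
    when the row player chooses action i and the column player action j.
    """
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for i, j in product(range(n_rows), range(n_cols)):
        row_payoff, col_payoff = payoffs[i][j]
        # The row player must not gain by switching rows while column j is fixed.
        row_ok = all(payoffs[k][j][0] <= row_payoff for k in range(n_rows))
        # The column player must not gain by switching columns while row i is fixed.
        col_ok = all(payoffs[i][k][1] <= col_payoff for k in range(n_cols))
        if row_ok and col_ok:
            equilibria.append((i, j))
    return equilibria

# Battle of the Sexes, Eq. (A.1): action 0 = Opera, action 1 = Sports,
# rows belong to the husband and columns to the wife.
BATTLE_OF_THE_SEXES = [[(1, 2), (0, 0)],
                       [(0, 0), (2, 1)]]

print(pure_nash_equilibria(BATTLE_OF_THE_SEXES))
```

For Eq. (A.1) the two coordinated outcomes (both at the opera, both at the sports event) are returned, each
preferred by a different player. Mixed equilibria are outside the scope of this sketch.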



A.4. Hawk-Dove game

Assume a population of animals (birds or others) where individuals are equal in almost all biological
properties except one: their aggressiveness in interactions with others. This behavioral attribute is genetically
coded, and animals exist in two forms: the aggressive type, to be called Hawk, and the cooperative type, to
be called Dove. Assume that each time two animals meet, they compete for a resource of value V that can represent
food. When two Doves meet they simply share the resource. When two Hawks meet they fight, and one of
them (randomly) gets seriously injured, while the other takes the resource. Finally, if a Dove meets a Hawk,
the Dove escapes without fighting, and the Hawk takes the full resource without injury. The payoff matrix
is

$$
\begin{array}{c|cc}
\text{Animal 1} \backslash \text{Animal 2} & \text{Hawk} & \text{Dove} \\ \hline
\text{Hawk} & \left(\frac{V-C}{2}, \frac{V-C}{2}\right) & (V,0) \\
\text{Dove} & (0,V) & \left(\frac{V}{2}, \frac{V}{2}\right)
\end{array}
\tag{A.2}
$$

where C > V is the cost of injury. The game assumes Darwinian evolution, in which fitness is associated
with average payoff in the population.

The Hawk-Dove game is mathematically equivalent to economists' Chicken and Snowdrift games. 
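
The mixed equilibrium of this game follows from a short calculation, sketched here under the assumption (not
spelled out above) that a fraction x of the population plays Hawk and that opponents are met at random. With
the payoffs of Eq. (A.2) the average payoffs of the two types are

$$
W_H = x\,\frac{V-C}{2} + (1-x)\,V, \qquad W_D = (1-x)\,\frac{V}{2},
$$

so that

$$
W_H - W_D = \frac{V - xC}{2}.
$$

The two types do equally well at $x^* = V/C$, which lies between 0 and 1 because C > V. Hawks are favored
below this fraction and Doves above it, so under these assumptions both types coexist.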

A.5. Matching Pennies

This is a 2 × 2 zero-sum matrix game. The players first determine who will be the winner for an outcome
with the "same" or "different" sides of the coins. Then, each player conceals in her palm a penny either with
its face up or down. The players reveal their choices simultaneously. If the pennies match (both are heads or
both are tails), the player "Same" receives one dollar from player "Different". Otherwise, player "Different"
wins and receives one dollar from the other player. The payoff matrix is

$$
\begin{array}{c|cc}
\text{Same} \backslash \text{Different} & \text{Head} & \text{Tail} \\ \hline
\text{Head} & (1,-1) & (-1,1) \\
\text{Tail} & (-1,1) & (1,-1)
\end{array}
\tag{A.3}
$$



The Nash equilibrium of this game is a mixed strategy: each player chooses heads or tails with equal 
probability. 

This game is equivalent to "Throwing Fingers" and "Odds or Evens".

A.6. Minority game

In this N-person game (N is odd) the players should choose between two options simultaneously and only
that action is successful which is chosen by the minority. Such a situation can occur in financial markets
where agents choose between "selling" and "buying", or in traffic when drivers choose from two possible
roads to take. The original version of the game, suggested by Arthur (1994), used as an example the El Farol
bar in Santa Fe, where Irish music is only enjoyable if the bar is not too crowded, and agents should decide
whether to visit the bar or stay at home. The evolutionary version was introduced by Challet and Zhang
(1997). Many aspects of the game are discussed in two recent books by Challet et al. (2004) and by Coolen
(2005).



A.7. Prisoner's Dilemma

The story for this game was invented by the mathematician Albert Tucker in 1950, when he wished to
illustrate the difficulty of analyzing certain kinds of games studied previously by Melvin Dresher and Merrill
Flood (scientists at the RAND Corporation, Santa Monica, California). Tucker's paradox has since given rise
to an enormous literature in areas as diverse as philosophy, biology, economics, behavioral and political
sciences, as well as game theory itself. The story of the "Prisoner's Dilemma" is the following:

Two burglars are arrested after their joint burglary and held separately by the police. However, the police
do not have sufficient proof to have them convicted, therefore the prosecutor visits each of them
and offers the same deal: if one confesses (called defection in the context of game theory) and the other
remains silent (cooperation - with the other prisoner), the silent accomplice receives a three-year sentence
and the confessor goes free. If both stay silent then the police can only give both burglars one year for a
minor charge. If both confess, each burglar receives a two-year sentence.

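According to the traditional notation the payoff matrix is

$$
\begin{array}{c|cc}
\text{Prisoner 1} \backslash \text{Prisoner 2} & \text{Defect} & \text{Cooperate} \\ \hline
\text{Defect} & (P,P) & (T,S) \\
\text{Cooperate} & (S,T) & (R,R)
\end{array}
\tag{A.4}
$$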



where P means "Punishment for mutual defection", T "Temptation to defect", S "Sucker's payoff", and R
"Reward for mutual cooperation". The matrix elements satisfy the following rank ordering: S < P < R < T.
For the repeated version (Iterated Prisoner's Dilemma) the additional constraint T + S < 2R is usually
also assumed. This ensures the long-run advantage of mutual cooperation over a strategy profile where
players cooperate and defect alternately in opposite phase.
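
These two conditions are easy to test for a given set of payoffs. The short Python sketch below is only an
illustration; the function name `is_prisoners_dilemma` is hypothetical, and the numerical payoffs are our own
reading of the burglars' story, taking minus the number of years in prison as the payoff.

```python
def is_prisoners_dilemma(T, R, P, S, iterated=False):
    """Check the rank ordering S < P < R < T of the payoffs in Eq. (A.4).

    With iterated=True the additional constraint T + S < 2R is required too.
    """
    ordering = S < P < R < T
    if iterated:
        return ordering and (T + S < 2 * R)
    return ordering

# Illustrative payoffs (an assumption): minus the years spent in prison.
# Confessor goes free (T), both silent (R), both confess (P), silent accomplice (S).
STORY_PAYOFFS = {"T": 0, "R": -1, "P": -2, "S": -3}

print(is_prisoners_dilemma(**STORY_PAYOFFS))
print(is_prisoners_dilemma(**STORY_PAYOFFS, iterated=True))
```

With this reading the story satisfies both the rank ordering and the additional constraint of the iterated
game, since T + S = -3 is smaller than 2R = -2.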

The reader can find further details about the history of this game in the book by Poundstone (1992),
together with many important applications. Experimental investigations of human behavior in Prisoner's
Dilemma situations have been expanding since the works of Trivers (1971) and Wedekind and Milinski
(1996).



A.8. Public Good game

In an experimental realization of the Public Good game (Ledyard, 1995; Hauert et al., 2002), an experimenter
gives some money c to each of N players. The players decide independently and simultaneously how
much to invest (if any) in a common pool. The collected sum is multiplied by a factor r (1 < r < N - 1)
and is redistributed to the N players equally, independently of their individual contributions. The maximum
total income is achieved if all players contribute maximally. In this case each player receives rc, thus the
final payoff is (r - 1)c. Players are faced with the temptation of being free-riders, i.e., to take advantage
of the common pool without contributing to it. In other words, any individual investment is a loss for the
player because only a portion r/N < 1 will be repaid. Consequently, rational players invest nothing and the
corresponding state is known as the "Tragedy of the Commons" (Hardin, 1968). In the game theory literature
this game is frequently referred to as the Tragedy of the Commons, the Free Rider problem, the Social
Dilemma, or the Multi-person Prisoner's Dilemma. The large variety of names reflects the large number of
situations when members of a society can benefit from the efforts of others, while having a temptation not
to pay the cost of these efforts (Baumol, 1952; Samuelson, 1954; Hardin, 1971).

Evidently, if there are only two players whose choices are restricted to two options (to invest nothing or all)
then this game becomes a Prisoner's Dilemma. Conversely, the N-person round robin Prisoner's Dilemma
game is equivalent to a Public Good game.
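
The payoff rule described above can be written down in a few lines. In the Python sketch below the function
name `public_good_payoffs` and the convention of reporting the net gain relative to the initial endowment c
are our own illustrative choices.

```python
def public_good_payoffs(contributions, c, r):
    """Net payoffs in one round of the Public Good game.

    contributions[i] is the amount (between 0 and c) invested by player i.
    The pool is multiplied by r and shared equally among all N players.
    The returned values are gains relative to the initial endowment c.
    """
    n = len(contributions)
    if any(x < 0 or x > c for x in contributions):
        raise ValueError("each contribution must lie between 0 and c")
    share = r * sum(contributions) / n
    return [share - x for x in contributions]

# Four players, endowment c = 1, multiplication factor r = 2.
print(public_good_payoffs([1, 1, 1, 1], c=1, r=2))  # everybody contributes
print(public_good_payoffs([0, 1, 1, 1], c=1, r=2))  # the first player free-rides
print(public_good_payoffs([0, 0, 0, 0], c=1, r=2))  # nobody contributes
```

Full contribution yields (r - 1)c for everybody, while a single free-rider earns more than the contributors,
which is the temptation discussed above.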



A.9. Quantum Penny Flip

This game was played by Captain Picard (P) and Q (two characters in the American TV series Star Trek:
The Next Generation) on the bridge of the Starship Enterprise. Q offers his help to the crew provided that
P can beat him. In this game first P puts a penny head up into a box, thereafter they can reverse the penny
without seeing it (first Q, then P, and finally Q again). In this two-person zero-sum game Q wins if the
penny is head up when opening the box. Between two classical players this game has a Nash equilibrium
when each player selects one of the two possibilities randomly in subsequent steps. In the present situation,
however, Q possesses a curious capability: he can create a quantum superposition state of the penny (we can
think of the quantum coin as the spin of an electron). Meyer (1999) shows that due to this advantage Q can
always win this game, if he puts the coin into a quantum state, i.e., an equal mixture of head $(1,0)^T$ and
tail $(0,1)^T$ [i.e., $a(1,0)^T + b(0,1)^T$, where a and b are complex numbers satisfying the conditions
$a\bar{a} + b\bar{b} = 1$ and $|a| = |b| = 1/\sqrt{2}$]. In the next step P can only perform a classical spin
reversal or leave the quantum coin unchanged. Q wins because the second quantum operation allows him to rotate
the coin into its original (head up) state independently of the action of P.

In the quantum mechanical context (Lee and Johnson, 2002) the 1/2 spin has two pure states: pointing up
or down along the z axis. The classical spin flip made (or not) by P rotates the spin about the x axis. In his
first action Q performs a Hadamard operation creating a spin state pointing along the +x axis. This state
cannot be affected by P so Q can easily restore the initial (spin up) state by the second (inverse Hadamard)
operation.
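
The sequence of moves can be followed with elementary 2 × 2 matrix algebra. The Python sketch below is an
illustration under the assumptions stated in its comments: the names `apply` and `head_probability` are ours,
the matrices are the standard Hadamard and spin-flip operations, and the second move of Q is again a Hadamard
operation because this operation is its own inverse.

```python
import math

SQRT_HALF = 1.0 / math.sqrt(2.0)
HADAMARD = [[SQRT_HALF, SQRT_HALF],
            [SQRT_HALF, -SQRT_HALF]]   # quantum move of Q
FLIP = [[0.0, 1.0], [1.0, 0.0]]        # classical reversal by P
KEEP = [[1.0, 0.0], [0.0, 1.0]]        # P leaves the coin unchanged

def apply(op, state):
    """Multiply a 2x2 matrix with a two-component state vector."""
    return [op[0][0] * state[0] + op[0][1] * state[1],
            op[1][0] * state[0] + op[1][1] * state[1]]

def head_probability(p_move):
    """Probability of finding 'head up' after the move sequence Q, P, Q."""
    state = [1.0, 0.0]                # P puts the penny head up: (1, 0)^T
    state = apply(HADAMARD, state)    # Q creates the equal superposition
    state = apply(p_move, state)      # P flips or not
    state = apply(HADAMARD, state)    # Q applies the self-inverse Hadamard again
    return state[0] ** 2              # amplitudes are real in this example

for name, move in (("flip", FLIP), ("no flip", KEEP)):
    print(name, round(head_probability(move), 12))
```

In both cases the probability of "head up" is one up to rounding, because the equal superposition created by
Q is left unchanged by the flip.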



A.10. Rock-Scissors-Paper game

The Rock-Scissors-Paper game is a two-person, zero-sum game with three pure strategies (items) named
"rock", "scissors", and "paper". After a synchronization procedure depending on culture, the two players
have to choose a strategy simultaneously and show it by their hands. If the players form the same item then
it is a tie, and the round is repeated once again. In this game each item is superior to one other, and inferior
to the third one: rock beats scissors beat paper beats rock. In other words, rock (indicated by keeping the
hand in a fist) crushes scissors, scissors (by extending the first two fingers and holding them apart) cut
paper, and paper (by holding the hand flat) covers rock.

Several extensions of this game have been invented and are played occasionally worldwide. These may
allow further choices. For example, in the Rock-Scissors-Paper-Spock-Lizard game each item beats two others
and is beaten by the remaining two, that is, scissors cut paper covers rock crushes lizard poisons Spock
smashes scissors decapitate lizard eats paper disproves Spock vaporizes rock crushes scissors.
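
Both the three-item game and its five-item extension follow one cyclic rule, which the following Python sketch
makes explicit. The function name `winner` and the particular ordering of the items in the tuples are
illustrative assumptions chosen so that each item beats those found an odd number of places after it.

```python
def winner(first, second, items):
    """Return 0 for a tie, 1 if `first` wins and 2 if `second` wins.

    `items` must list an odd number of items in cyclic order such that each
    item beats those found 1, 3, 5, ... places after it.
    """
    i, j = items.index(first), items.index(second)
    distance = (j - i) % len(items)
    if distance == 0:
        return 0
    return 1 if distance % 2 == 1 else 2

THREE_ITEMS = ("rock", "scissors", "paper")
FIVE_ITEMS = ("scissors", "paper", "rock", "lizard", "Spock")

print(winner("rock", "scissors", THREE_ITEMS))   # rock crushes scissors
print(winner("Spock", "rock", FIVE_ITEMS))       # Spock vaporizes rock
print(winner("lizard", "scissors", FIVE_ITEMS))  # scissors decapitate lizard
```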



A.11. Snowdrift game

Two drivers are trapped on opposite sides of a snowdrift. They have to choose between two options: (1) to
get out and start shoveling (cooperate); (2) to remain in the car (defect). If both drivers are willing to shovel
then each one has the benefit b of getting home and they share the cost c of the work, i.e., each receives
the reward of mutual cooperation R = b - c/2. If both drivers choose defection then they do not get home
and obtain zero benefit (P = 0). If only one of them shovels then both get home, however, the defector's
income (T = b) is not reduced by the cost of shoveling, whereas the cooperator gets S = b - c. This 2 × 2
matrix game becomes equivalent to the Prisoner's Dilemma in Eq. (A.4) when 2b > c > b > 0. However, for
b > c > 0 the payoffs generate the Snowdrift game, which is in fact equivalent to the Hawk-Dove game. Two
different versions of the spatial evolutionary Snowdrift game were very recently introduced and studied by
Hauert and Doebeli (2004) and Sysi-Aho et al. (2005).
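
The two parameter regions can be distinguished by the rank ordering of the payoffs. The Python sketch below is
an illustration only; the function names `snowdrift_payoffs` and `classify` are hypothetical, and the label
"other" for the remaining parameter values is our own shorthand.

```python
def snowdrift_payoffs(b, c):
    """Payoffs T, R, P, S of the snowdrift story for benefit b and cost c."""
    return {"T": b, "R": b - c / 2, "P": 0.0, "S": b - c}

def classify(b, c):
    """Classify the 2x2 game by the rank ordering of its payoffs."""
    p = snowdrift_payoffs(b, c)
    if p["S"] < p["P"] < p["R"] < p["T"]:
        return "Prisoner's Dilemma"   # obtained for 2b > c > b > 0
    if p["P"] < p["S"] < p["R"] < p["T"]:
        return "Snowdrift"            # obtained for b > c > 0
    return "other"

print(classify(b=1.0, c=1.5))
print(classify(b=1.0, c=0.5))
```

In the Snowdrift regime the sucker's payoff S exceeds the punishment P, so shoveling alone is still better
than staying in the car when the other driver defects.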



A.12. Stag Hunt game

The story of the Stag Hunt game was briefly described by Rousseau in A Discourse on Inequality (1755). 
In Maurice Cranston's translation deer means stag: 

If it was a matter of hunting a deer, everyone well realized that he must remain faithfully at his post; but 
if a hare happened to pass within the reach of one of them, we cannot doubt that he would have gone off in 
pursuit of it without scruple and, having caught his own prey, he would have cared very little about having 
caused his companions to lose theirs. 






Each hunter prefers stag to hare and hare to nothing. In the context of game theory this means that the 
highest income is reached if each hunter chooses hunting deer. The chance of a successful deer hunt increases 
with the number of hunters, and practically there is no chance of bagging a deer by oneself. At the same 
time the chance of getting a hare is independent of what others do. Consequently, for two hunters the payoffs
can be given by a bi-matrix as

$$
\begin{array}{c|cc}
\text{Hunter 1} \backslash \text{Hunter 2} & \text{Hare} & \text{Stag} \\ \hline
\text{Hare} & (1,1) & (2,0) \\
\text{Stag} & (0,2) & (3,3)
\end{array}
$$

The Stag Hunt game is a prototype of the social contract; it is in fact a special case of Coordination
games.



A.13. Ultimatum game

In the Ultimatum game two players have to agree on how to share a sum of money. One of the players, chosen
randomly (called proposer), makes an offer and the other (responder) can either accept or reject it.
If the offer is accepted, the money is shared accordingly; if rejected, both players receive nothing. In the
one-shot game rational responders should accept any positive offer, while the best choice for the proposer
is to offer the minimum (positive) sum (Gintis, 2000). In human experiments (Güth et al., 1982; Thaler,
1988; Henrich et al., 2001) the majority of proposers offer 40 to 50% of the total sum, and the responders
frequently reject offers below 30%. This human behavior can be reproduced by several evolutionary versions
of the Ultimatum game (Nowak et al., 2000; Page and Nowak, 2000; Sánchez and Cuesta, 2005). A spatial
version of the game was studied by Page et al. (2000).



Appendix B. Strategies 



B.1. Pavlovian strategies

The Pavlov strategy was introduced in the context of the Iterated Prisoner's Dilemma. By definition
Pavlov works according to the following algorithm: "repeat your latest action if that produced one of the
two highest possible payoffs, and switch to the other possible action, if your last round payoff was one
of the two lowest possible payoffs". As such Pavlov belongs to the more general class of Win-Stay-Lose-Shift
strategies, which define a direct payoff criterion (aspiration level) for strategy change. An alternative
definition frequently appearing in the literature is "cooperate if and only if you and your opponent used the
same move in the previous round". Of course, this translates into the same rule for the Prisoner's Dilemma.

For a general discussion of Pavlovian strategies see Kraines and Kraines (1989).



B.2. Tit-for-Tat

The Tit-for-Tat strategy suggested by Anatol Rapoport became known worldwide after winning the
computer tournaments conducted by Axelrod (1984). For the Iterated Prisoner's Dilemma this strategy
starts by cooperating in the first step and afterwards repeats the previous decision of the opponent.

Tit-for-Tat cooperates mutually with all so-called "nice" strategies, and it is never the first to defect. On
long time scales the Tit-for-Tat strategy cannot be exploited, because any defection is retaliated by playing
defection until the co-player chooses cooperation again. Then the opponent's extra income gained at the
first defection is returned to Tit-for-Tat. At the same time, it is a forgiving strategy, because it is willing
to cooperate again until the next defection of its opponent. Against Tit-for-Tat the best choice is mutual
cooperation. In multi-agent evolutionary Prisoner's Dilemma games the Tit-for-Tat strategy helps effectively
to maintain cooperative behavior. Axelrod (1984) concluded that individuals should follow this strategy in
a declarative way in all repeated decision situations analogous to the Prisoner's Dilemma.

This deterministic strategy, however, has a drawback in noisy environments, where two Tit-for-Tat strategists
may easily end up alternating cooperation and defection in opposite phase without real hope of getting
out of this deadlock. Most of the modified versions of Tit-for-Tat were introduced to eliminate or reduce
this shortcoming. For example, Tit for Two Tats only defects if its opponent has defected twice in a
row; Generous (Forgiving) Tit-for-Tat cooperates with some probability even if the co-player has defected
previously, etc.



B.3. Win-Stay-Lose-Shift

It seems that the Win-Stay-Lose-Shift (WSLS) idea as a learning rule was originally introduced as early
as 1911 by Thorndike (1911). WSLS strategies use a heuristic update rule which depends on a direct payoff
criterion, a so-called aspiration level, telling the agent when to change her intended action. If the payoff
average of recent rounds is above the aspiration level the agent keeps her original action, otherwise she
switches to a new one. As such the aspiration level distinguishes between winning and losing situations. In
the case when there is more than one alternative to switch to, the choice can be random. The simplest
example of a WSLS-type strategy is Pavlov, introduced in the context of the Iterated Prisoner's Dilemma.
Pavlov was demonstrated to be able to defeat Tit-for-Tat in noisy environments (Nowak and Sigmund, 1993;
Kraines and Kraines, 1993), thanks to its ability to correct mistakes and exploit unconditional cooperators
better than Tit-for-Tat.



B.4. Stochastic reactive strategies

Decisions in reactive strategies depend only on the previous move of the opponent. As suggested by
Nowak and Sigmund (1989a,b) in the context of the Prisoner's Dilemma game, these strategies are characterized
by two parameters, p and q ($0 \le p, q \le 1$), which denote the probability to cooperate after the
opponent has cooperated or defected. The definition of these strategies is made complete by introducing
a third parameter u characterizing the probability of cooperation in the first step. For many evolutionary
rules the value of u becomes irrelevant in the stationary population of reactive strategies (u, p, q). Evidently,
for p = q the decision is independent of the opponent's choice and the strategy (0.5, 0.5, 0.5) represents a
completely random decision. The strategies (0, 0, 0) and (1, 1, 1) are equivalent to unconditional defection
(AllD) and unconditional cooperation (AllC), respectively. Tit-for-Tat and Suspicious Tit-for-Tat can be
represented as (1, 1, 0) and (0, 1, 0).

The above class of reactive strategies was extended by Nowak and Sigmund (1995) in a way that allows
different cooperation probabilities for all possible outcomes realized in the previous step. This strategy can
be represented as $(p_1, p_2, p_3, p_4)$ where $p_1$, $p_2$, $p_3$, and $p_4$ denote the cooperation probability
after the players' previous decisions (C, C), (C, D), (D, C), and (D, D), respectively. (For noisy systems we can
ignore the probability u of cooperating in the first step.) Evidently, some elements of this wider set of
strategies can also be related to other strategies. For example, Tit-for-Tat corresponds to (1, 0, 1, 0).
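
The four-parameter representation lends itself to a direct simulation. The Python sketch below is restricted
to deterministic strategies (all probabilities 0 or 1) and is meant as an illustration only: the helper names
are ours, and Pavlov is encoded as (1, 0, 0, 1) following the second definition quoted in Appendix B.1. Both
games start right after a single erroneous defection by the first player.

```python
# Cooperation probabilities after the previous outcomes (C, C), (C, D),
# (D, C), (D, D), each written as (own move, opponent's move).
TIT_FOR_TAT = (1, 0, 1, 0)
PAVLOV = (1, 0, 0, 1)   # cooperate iff both players used the same move

INDEX = {("C", "C"): 0, ("C", "D"): 1, ("D", "C"): 2, ("D", "D"): 3}

def next_move(strategy, own_last, opp_last):
    """Deterministic move of a strategy whose probabilities are all 0 or 1."""
    return "C" if strategy[INDEX[(own_last, opp_last)]] == 1 else "D"

def play(strategy_a, strategy_b, first_round, rounds):
    """Return the list of joint moves, starting from the given first round."""
    history = [first_round]
    for _ in range(rounds - 1):
        a_last, b_last = history[-1]
        history.append((next_move(strategy_a, a_last, b_last),
                        next_move(strategy_b, b_last, a_last)))
    return history

print(play(TIT_FOR_TAT, TIT_FOR_TAT, ("D", "C"), 5))
print(play(PAVLOV, PAVLOV, ("D", "C"), 5))
```

Two Tit-for-Tat players keep alternating cooperation and defection in opposite phase, whereas two Pavlov
players pass through one round of mutual defection and then return to mutual cooperation.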



Appendix C. Generalized mean-field approximations 



Now we give a simple and concise introduction to the use of the generalized mean-field technique on such
lattice systems where the spatial distribution of Q different states is described by a set of site variables
denoted shortly as $s_x = 1, \ldots, Q$. For the sake of simplicity the present description will be formulated on a
square lattice at the levels of one-, two-, and four-site approximations.

In this approach the translation and rotation invariant states of the system are described by a set of
configuration probabilities on compact clusters of sites with different sizes (and forms). The time-dependence
of these configuration probabilities is not denoted. Thus, the one-site configuration probability $p_1(s_1)$
characterizes the probability of finding state $s_1$ at any site x of the lattice. Similarly, $p_2(s_1, s_2)$ indicates the
configuration probability of a pair of states $(s_1, s_2)$ on two neighboring sites independently of both the
position and direction of the two-site cluster. These quantities satisfy the compatibility conditions that obey
simple forms if translation invariance holds, i.e.,

$$
p_1(s_1) = \sum_{s_2} p_2(s_1, s_2) = \sum_{s_2} p_2(s_2, s_1). \tag{C.1}
$$

On the square lattice the quantities $p_4(s_1, s_2, s_3, s_4)$ describe all the possible four-site configuration
probabilities on the 2 × 2 clusters of sites. In this notation the horizontal and vertical pairs are $(s_1, s_2)$,
$(s_3, s_4)$, and $(s_1, s_3)$, $(s_2, s_4)$, respectively. Evidently, these quantities are related to the pair
configuration probabilities via the following compatibility conditions:

$$
\begin{aligned}
p_2(s_1, s_2) &= \sum_{s_3, s_4} p_4(s_1, s_2, s_3, s_4) \\
              &= \sum_{s_3, s_4} p_4(s_3, s_4, s_1, s_2) \\
              &= \sum_{s_3, s_4} p_4(s_1, s_3, s_2, s_4) \\
              &= \sum_{s_3, s_4} p_4(s_3, s_1, s_4, s_2).
\end{aligned}
\tag{C.2}
$$

The above compatibility conditions [including the normalization $\sum_{s_1} p_1(s_1) = 1$] allow us to reduce the
number of parameters in the description of all the possible configuration probabilities. For example, assuming
translation and rotation symmetry in a two-state (Q = 2) system, the one- and two-site configuration
probabilities can be characterized by two parameters,

$$
p_1(1) = \varrho, \qquad p_1(2) = 1 - \varrho, \tag{C.3}
$$

$$
\begin{aligned}
p_2(1,1) &= \varrho^2 + q, \\
p_2(1,2) &= \varrho(1-\varrho) - q, \\
p_2(2,1) &= \varrho(1-\varrho) - q, \\
p_2(2,2) &= (1-\varrho)^2 + q,
\end{aligned}
\tag{C.4}
$$

where $0 < \varrho < 1$ means the average concentration of state 1, and q denotes the deviation from the prediction
of the mean-field approximation [$-\min(\varrho^2, (1-\varrho)^2) < q < \varrho(1-\varrho)$]. The reader can easily check that
under the same conditions the four-site configuration probabilities can be described by introducing three
additional parameters. Notice that this type of parametrization reduces the number of variables we have
to determine by solving a suitable set of equations of motion. Evidently, significantly more parameters are
necessary for Q > 2 despite the possible additional symmetries (e.g., cyclic symmetry in the Rock-Scissors-Paper
game) reducing the number of independent configuration probabilities. The choice of an appropriate
parametrization can significantly simplify the calculations.

Within the framework of the traditional mean-field theory the configuration probabilities on a given
cluster are approximated by a product of one-site configuration probabilities, i.e., $p_2(s_1, s_2) \simeq p_1(s_1) p_1(s_2)$
[corresponding to q = 0 in Eqs. (C.4)] and $p_4(s_1, s_2, s_3, s_4) \simeq p_1(s_1) p_1(s_2) p_1(s_3) p_1(s_4)$. For this approach
the compatibility conditions remain self-consistent and the normalization is conserved.

Adopting the Bayesian extension process (Gutowitz et al., 1987) we can construct a better approximation
for the configuration probabilities on large clusters by building them from configuration probabilities on
smaller clusters. For example, on three subsequent sites the configuration probability can be constructed
from two two-site configuration probabilities as

$$
p_3(s_1, s_2, s_3) \simeq \frac{p_2(s_1, s_2)\, p_2(s_2, s_3)}{p_1(s_2)}. \tag{C.5}
$$

At the level of the pair approximation on k (k > 2) subsequent sites (positioned linearly) the corresponding
expression obeys the following form:

$$
p_k(s_1, \ldots, s_k) \simeq p_2(s_1, s_2) \prod_{n=2}^{k-1} \frac{p_2(s_n, s_{n+1})}{p_1(s_n)}. \tag{C.6}
$$

This equation predicts exponentially decreasing configuration probabilities for a homogeneous distribution
on a k-site block. For example,

$$
p_k(1, \ldots, 1) \simeq p_1(1) \left[ \frac{p_2(1,1)}{p_1(1)} \right]^{k-1} = p_1(1)\, e^{-(k-1)/\xi_1}, \tag{C.7}
$$

where $\xi_1 = -1/\ln[p_2(1,1)/p_1(1)]$ characterizes the typical size of the homogeneous domain.
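
The construction of Eqs. (C.5)-(C.7) can be checked numerically for the two-state parametrization of
Eqs. (C.3) and (C.4). The following Python sketch is an illustration under our own naming (the functions
`pair_probabilities`, `block_probability` and `domain_size` are hypothetical helpers, and the parameter values
are arbitrary); the states are labeled 1 and 2 as in the text.

```python
import math

def pair_probabilities(rho, q):
    """One- and two-site probabilities of Eqs. (C.3) and (C.4)."""
    p1 = {1: rho, 2: 1.0 - rho}
    p2 = {(1, 1): rho**2 + q,
          (1, 2): rho * (1.0 - rho) - q,
          (2, 1): rho * (1.0 - rho) - q,
          (2, 2): (1.0 - rho)**2 + q}
    return p1, p2

def block_probability(states, p1, p2):
    """Pair-approximation estimate of Eq. (C.6) for k >= 2 sites in a row."""
    prob = p2[(states[0], states[1])]
    for n in range(1, len(states) - 1):
        prob *= p2[(states[n], states[n + 1])] / p1[states[n]]
    return prob

def domain_size(p1, p2):
    """Typical size of homogeneous domains of state 1, as defined below Eq. (C.7)."""
    return -1.0 / math.log(p2[(1, 1)] / p1[1])

p1, p2 = pair_probabilities(rho=0.4, q=0.05)
print(block_probability((1, 1, 1, 1, 1), p1, p2))
print(p1[1] * math.exp(-4.0 / domain_size(p1, p2)))
```

The two printed numbers agree up to rounding, and summing `block_probability` over all three-site
configurations gives one, in line with the statement that the normalization is conserved.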
Knowing the pair configuration probabilities the autocorrelation functions can be expressed as

$$
C_s(x) = \sum_{s_2, \ldots, s_x} p_{x+1}(s, s_2, \ldots, s_x, s) - p_1(s)\, p_1(s). \tag{C.8}
$$

For the parametrization given by Eqs. (C.3) and (C.4) in a two-state system this autocorrelation function
obeys a simple form, $C_s(x = 1) = q$. Using successive iterations one can easily derive the following relation:

$$
C_s(x) = \varrho(1-\varrho) \left[ \frac{q}{\varrho(1-\varrho)} \right]^x = \varrho(1-\varrho)\, e^{-x/\xi}, \tag{C.9}
$$

where the correlation length is

$$
\xi = -\frac{1}{\ln[q/(\varrho(1-\varrho))]}. \tag{C.10}
$$
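
The relation between Eqs. (C.8) and (C.9) can be verified by brute force for short distances. The Python
sketch below is an illustration with hypothetical function names: it sums the pair-approximation chain
probabilities of Eq. (C.6) over all intermediate states and compares the result with the closed form. The
parameter values are arbitrary choices with q > 0.

```python
from itertools import product

def autocorrelation(x, rho, q, s=1):
    """Evaluate Eq. (C.8) for x >= 1 with the pair approximation of Eq. (C.6)."""
    p1 = {1: rho, 2: 1.0 - rho}
    offdiag = rho * (1.0 - rho) - q
    p2 = {(1, 1): rho**2 + q, (1, 2): offdiag,
          (2, 1): offdiag, (2, 2): (1.0 - rho)**2 + q}
    total = 0.0
    # Sum over all states of the x - 1 sites between the two end points.
    for middle in product((1, 2), repeat=x - 1):
        chain = (s,) + middle + (s,)
        prob = p2[(chain[0], chain[1])]
        for n in range(1, len(chain) - 1):
            prob *= p2[(chain[n], chain[n + 1])] / p1[chain[n]]
        total += prob
    return total - p1[s] ** 2

def closed_form(x, rho, q):
    """Right-hand side of Eq. (C.9)."""
    return rho * (1.0 - rho) * (q / (rho * (1.0 - rho))) ** x

for x in range(1, 5):
    print(x, autocorrelation(x, 0.4, 0.05), closed_form(x, 0.4, 0.05))
```

For x = 1 both expressions reduce to q, and for larger x the enumeration reproduces the exponential decay for
both states s = 1 and s = 2.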

On a 3 × 2 cluster the six-site configuration probabilities can be approximated as a product of two four-site
configuration probabilities,

$$
p_6(s_1, \ldots, s_6) \simeq \frac{p_4(s_1, s_2, s_4, s_5)\, p_4(s_2, s_3, s_5, s_6)}{p_2(s_2, s_5)}, \tag{C.11}
$$

where the configuration probability on the overlapping region of the two clusters appears in the denominator.
The above expressions can be represented graphically as shown in Fig. C.1. In this representation the
solid lines and squares represent two- and four-site configuration probabilities (in the numerator) with
site variables at the connected sites (indicated by pluses), while the same quantities in the denominator
are plotted by dashed lines. At the same time the quantities $p_1(s_2)$ are denoted by closed (open) circles
at the suitable sites if they appear in the numerator (denominator) of the corresponding products. In
these constructions the compatibility conditions are not satisfied completely, although the normalization is
conserved. For example, the compatibility condition is broken for the second-neighbor pair configuration
probabilities [because $\sum_{s_2} p_3(s_1, s_2, s_3)$ deviates from the value predicted by Eq. (C.5),
$\sum_{s_2} p_2(s_1, s_2)\, p_2(s_2, s_3)/p_1(s_2)$], while it remains valid for the nearest-neighbor pair configurations.

Each elementary step in the variation of the spatial distribution of states will modify the above
configuration probabilities. Knowing the dynamical rules we can derive a set of equations of motion
that summarizes the contributions of all types of elementary steps with the suitable weight. In order to
demonstrate the essence of the derivation of these equations we now choose a very simple model.

Let us consider a three-species, cyclic predator-prey model (also called spatial Rock-Scissors-Paper game)
on a square lattice where species 1 invades species 2 invades species 3 invades species 1 with the same
invasion rates chosen to be unity. We assume that the evolution is governed by random sequential updates.
More precisely, a randomly chosen species can be invaded by one of the four neighbors chosen randomly
with a probability of 1/4.

Fig. C.1. Graphical representation of the construction of a configuration probability from configuration probabilities on smaller
clusters for a three-site (a) and a six-site cluster (b).



Fig. C.2. The possible invasion processes (a)-(d) from $s_2$ to $s_1$ which modify the one-site configuration probabilities.

When a predator $s_2$ invades the site of its neighboring prey $s_1$, this process decreases the value
of $p_1(s_1)$ and simultaneously increases $p_1(s_2)$ by the same magnitude, with a rate that becomes unity with a
suitable choice of the time scale. Due to the symmetries, all four possible invasion processes (see Fig. C.2)
give the same contribution to the time derivative of the one-site configuration probabilities. Consequently,
the derivative of $p_1(s_1)$ with respect to time can be expressed as

$$
\dot{p}_1(s_1) = -\sum_{s_x} p_2(s_1, s_x)\, \Gamma_{rsp}(s_1 \to s_x)
               + \sum_{s_x} p_2(s_x, s_1)\, \Gamma_{rsp}(s_x \to s_1), \tag{C.12}
$$

where $\Gamma_{rsp}(s_x \to s_y) = 1$ if the species $s_y$ is the predator of species $s_x$ and $\Gamma_{rsp}(s_x \to s_y) = 0$ otherwise.

In fact, the above general mathematical formulae hide the simplicity of both the calculation and results
because the corresponding three equations for $s_1 = 1$, 2, and 3 obey the following forms:

$$
\begin{aligned}
\dot{p}_1(1) &= p_2(1,3) - p_2(1,2), \\
\dot{p}_1(2) &= p_2(2,1) - p_2(2,3), \\
\dot{p}_1(3) &= p_2(3,2) - p_2(3,1).
\end{aligned}
\tag{C.13}
$$

Notice that the time derivatives of the functions $p_1(s_1)$ depend on the two-site configuration probabilities
$p_2(s_1, s_2)$. For the traditional mean-field approximation the two-site configuration probabilities are
approximated as $p_2(s_1, s_2) = p_1(s_1)\, p_1(s_2)$. This approximation yields Eqs. (134) and the solutions are discussed in
Section 7.2.
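
The mean-field equations obtained in this way are easy to integrate numerically. The Python sketch below is
an illustration under stated assumptions: it takes Eqs. (C.13) with the factorized pair probabilities, uses a
standard fourth-order Runge-Kutta step of our own choosing, and all function names, the step size and the
initial concentrations are hypothetical. The concentrations are called c1, c2 and c3 to avoid confusion with
the pair probabilities.

```python
def mean_field_rhs(c):
    """Eqs. (C.13) with p2(s1, s2) replaced by the product p1(s1) * p1(s2)."""
    c1, c2, c3 = c
    return (c1 * (c3 - c2), c2 * (c1 - c3), c3 * (c2 - c1))

def rk4_step(c, dt):
    """One fourth-order Runge-Kutta step for the three concentrations."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = mean_field_rhs(c)
    k2 = mean_field_rhs(shifted(c, k1, dt / 2))
    k3 = mean_field_rhs(shifted(c, k2, dt / 2))
    k4 = mean_field_rhs(shifted(c, k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * d + e)
                 for x, a, b, d, e in zip(c, k1, k2, k3, k4))

def integrate(c0, dt=0.01, steps=5000):
    """Integrate from the initial concentrations c0 over steps * dt time units."""
    c = tuple(c0)
    for _ in range(steps):
        c = rk4_step(c, dt)
    return c

final = integrate((0.5, 0.3, 0.2))
print(final, sum(final), final[0] * final[1] * final[2])
```

The right-hand sides sum to zero, and the time derivative of the product of the three concentrations vanishes
as well, so both the sum and the product printed above stay at their initial values up to the numerical
error; the concentrations themselves oscillate.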

More accurate results can be obtained by deriving a further set of equations of motion for $p_2(s_1, s_2)$ to
improve the present approach. In analogy with the derivation of Eqs. (C.12) we can sum up the contributions
of all the elementary processes (see Fig. C.3).

Fig. C.3. The horizontal two-site configuration probabilities $p_2(s_1, s_2)$ are modified by the possible invasion processes indicated
by arrows.

Within a horizontal pair there are two types of internal invasions [processes (a) and (b) in Fig. C.3] affecting
the configuration probability $p_2(s_1, s_2)$. Their contribution to the function $\dot{p}_2(s_1, s_2)$ is not influenced
by the surrounding states, and therefore it is proportional to the probability of the initial pair
configuration $p_2(s_1, s_2)$. The contribution of the external invasions [processes (c)-(h) in Fig. C.3], however, is
proportional to the corresponding three-site configuration probabilities that can be constructed from pair
configuration probabilities as described above. This approximation neglects the shape differences between
the three-site clusters. In this case the approximate equations of motion for $p_2(s_1, s_2)$ can be given as



P2(si,s 2 ) =- ^P2(si, s 2 )r rsp (s 2 -» Si) 

- -pn{sx, S2)r rsp (si — ► s 2 ) 



-S(S!, s 2 )'^2p2(si, s s )T Isp (s 3 -> Si) 

S3 

-8(S!, S 2 )^2P2(S3, S2)T Is p(s 3 -> s 2 ) 



3 
4 

3 
4 

3 
4 

3 
4 



S3 



S3 



S3 



P2(S 3 


si)p 2 (si,s 2 ) 




Pi(si) 


P2(s 1 


S2)P2(S 2 ,S 3 ) 




P1O2) 


P2(S1 


S 3 )P2(S 3 ,S 2 ) 




Pi (S3) 


P2(S1 


S3)P2(S3,S2) 


Pi (S3) 



(C.14) 



where $\delta(s_1, s_2)$ denotes the Kronecker delta. It is worth mentioning that the right-hand side of Eq.
(C.14) depends only on the pair configuration probabilities if the compatibility conditions, Eq. (C.1), are
taken into consideration.
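The gain and loss bookkeeping behind Eq. (C.14) can be checked numerically with a short sketch. The function name `pair_rhs`, the array layout `gamma[a, b]` for the rate $\Gamma(a \to b)$ at which a site in state $a$ adopts state $b$, and the unit-rate cyclic dominance used in the demonstration are illustrative assumptions, not definitions taken from the text. The single-site probabilities are obtained from the pair probabilities through the compatibility condition.

```python
import numpy as np

def pair_rhs(p2, gamma):
    """Right-hand side of the pair-level equations of motion, Eq. (C.14).

    p2[a, b]    : pair configuration probability p_2(a, b)
    gamma[a, b] : assumed layout, rate Gamma(a -> b) (a site in state a adopts b)
    """
    n = p2.shape[0]
    p1 = p2.sum(axis=1)  # compatibility condition: p_1(a) = sum_b p_2(a, b)
    dp = np.zeros_like(p2, dtype=float)
    for s1 in range(n):
        for s2 in range(n):
            # internal invasions destroying the pair (s1, s2), prefactor 1/4
            dp[s1, s2] -= 0.25 * p2[s1, s2] * (gamma[s2, s1] + gamma[s1, s2])
            for s3 in range(n):
                if s1 == s2:
                    # internal invasions creating a homogeneous pair
                    dp[s1, s2] += 0.25 * (p2[s1, s3] * gamma[s3, s1]
                                          + p2[s3, s2] * gamma[s3, s2])
                # external invasions destroying (s1, s2), prefactor 3/4
                if p1[s1] > 0:
                    dp[s1, s2] -= 0.75 * p2[s3, s1] * p2[s1, s2] / p1[s1] * gamma[s1, s3]
                if p1[s2] > 0:
                    dp[s1, s2] -= 0.75 * p2[s1, s2] * p2[s2, s3] / p1[s2] * gamma[s2, s3]
                # external invasions creating (s1, s2) from (s3, s2) or (s1, s3)
                if p1[s3] > 0:
                    dp[s1, s2] += (0.75 * p2[s1, s3] * p2[s3, s2] / p1[s3]
                                   * (gamma[s3, s1] + gamma[s3, s2]))
    return dp

# Hypothetical cyclic dominance: state a is replaced by state (a + 1) mod 3 at unit rate.
gamma_rsp = np.zeros((3, 3))
for a in range(3):
    gamma_rsp[a, (a + 1) % 3] = 1.0

# Uncorrelated initial state with arbitrary concentrations.
p1_init = np.array([0.5, 0.3, 0.2])
p2_init = np.outer(p1_init, p1_init)
dp2 = pair_rhs(p2_init, gamma_rsp)
```

For an uncorrelated, symmetric state the derivatives sum to zero, so normalization is conserved, and summing `dp2` over the second index reproduces the single-site (mean-field) rate of change.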

A more complicated set of equations of motion can be derived for those systems where the probability of
the elementary invasion (strategy adoption) processes depends not only on the given two site variables but on
their surroundings too. Such a situation is illustrated in Fig. C.4 where the probability of strategy adoption
from one of the randomly chosen neighbors (e.g., $s_2$ is substituted for $s_1$) depends on the neighborhood
$(s_3, \ldots, s_8)$.

Fig. C.4. The horizontal two-site configuration probabilities $p_2(s_1, s_2)$ are modified by the internal invasion process (indicated
by the arrow) with a probability depending on all the labeled sites. The graphical representation of the construction of the
corresponding eight-site configuration probabilities from pair configuration probabilities is illustrated as above.

In this case the contribution $\Delta_{s_1 \to s_2}$ of this elementary process to $p_2(s_1, s_2)$ is negative, and its magnitude is

$$
\sum_{s_3, \ldots, s_8} p_8(s_1, \ldots, s_8)\, \Gamma(s_1 \to s_2),
\tag{C.15}
$$

where some possible definitions for $\Gamma(s_1 \to s_2)$ are discussed in Section 4.3 [for a definite expression see Eq.
(124)]. Simultaneously, these processes give the opposite contribution, $-\Delta_{s_1 \to s_2}$, to $p_2(s_2, s_2)$. According to the
above graphical representation, at the pair approximation level the eight-site configuration probabilities can
be approximated as



$$
p_8(s_1, \ldots, s_8) \simeq p_2(s_1, s_2)\,
\frac{p_2(s_1, s_3)\, p_2(s_1, s_5)\, p_2(s_1, s_7)}{p_1^3(s_1)}\,
\frac{p_2(s_2, s_4)\, p_2(s_2, s_6)\, p_2(s_2, s_8)}{p_1^3(s_2)},
\tag{C.16}
$$



where the contributions of the denominator are indicated by the three concentric circles at the sites $s_1$ and
$s_2$ in Fig. C.4. Similar terms describe the contributions of the opposite strategy adoption ($s_2 \to s_1$), as well
as of those processes where an external strategy is substituted for either $s_1$ or $s_2$ (e.g., $s_1 \to s_3$). Thus,
the complete set of the equations of motion for $p_2(s_1, s_2)$ can be obtained by summing up the contributions
of each possible elementary process. Despite the simplicity of the derivation of the corresponding differential
equations, the final formulae become very lengthy even with the use of sophisticated notation. This is the
reason why we do not display the complete formulae. It is emphasized, however, that this difficulty becomes
irrelevant in the numerical solutions, because efficient algorithms can be developed to derive the equations
of motion.
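As an illustration of how such a construction can be automated, the following sketch evaluates the pair-approximation estimate of Eq. (C.16) for a given eight-site configuration. The function name `p8_pair_approx` and the indexing convention (`s[0]` holds $s_1$, ..., `s[7]` holds $s_8$, with odd-labeled sites attached to $s_1$ and even-labeled sites attached to $s_2$) are assumptions made for this example.

```python
import numpy as np
from itertools import product

def p8_pair_approx(s, p2):
    """Pair-approximation estimate of p_8(s_1, ..., s_8) following Eq. (C.16).

    s  : sequence of eight states; s[0] is s_1, ..., s[7] is s_8 (assumed layout)
    p2 : array with p2[a, b] = p_2(a, b)
    """
    p1 = p2.sum(axis=1)              # compatibility condition
    s1, s2 = s[0], s[1]
    value = p2[s1, s2]
    if value == 0.0:
        return 0.0                   # also avoids dividing by p_1 = 0
    for k in (2, 4, 6):              # s_3, s_5, s_7: further neighbors of s_1
        value *= p2[s1, s[k]] / p1[s1]
    for k in (3, 5, 7):              # s_4, s_6, s_8: further neighbors of s_2
        value *= p2[s2, s[k]] / p1[s2]
    return value

# Demonstration with a correlated two-state pair distribution (illustrative numbers).
p2_demo = np.array([[0.4, 0.1],
                    [0.1, 0.4]])
total = sum(p8_pair_approx(s, p2_demo) for s in product(range(2), repeat=8))
```

Because every site other than $s_1$ and $s_2$ is attached through a single pair, this particular construction remains normalized: `total` equals 1 up to rounding. For an uncorrelated state it reduces to the product of the eight single-site probabilities.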

For a small number of independent configuration probabilities there is always some chance of finding analytical
solutions. Besides this, two methods are used to solve the resulting set of equations of motion numerically.
In the first method the differential equations are integrated numerically (with respect to time) starting
from an initial state with configuration probabilities satisfying the symmetry conditions. This is the only
option if the system tends to a limit cycle. In general, the initial state can be any uncorrelated state with
arbitrary concentration of states. Special initial states can be chosen if we wish to study the competition
between two types of ordered domains which do not have common configurations. Using this approach, we
can study the average velocity of interfaces separating two different homogeneous states. For example, at
the level of the pair approximation in a one-dimensional, two-state system the evolution can be started from
the state $p_2(1, 0) = p_2(0, 1) = \epsilon$ and $p_2(0, 0) = p_2(1, 1) = 1/2 - \epsilon$, where $0 < \epsilon \ll 1$ is the density of
(0,1) and (1,0) domain walls. After some transient the quantity $\dot{p}_1(1)/2\epsilon$ characterizes the average
velocity of the growth of large domains of state 1.
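A minimal sketch of this first approach is given below. The fixed-step fourth-order Runge-Kutta scheme is one possible choice of integrator and is not prescribed by the text; the names `domain_wall_state`, `rk4_integrate` and `wall_velocity` are hypothetical, and `rhs` stands for any user-supplied function returning the time derivatives of the pair probabilities.

```python
import numpy as np

def domain_wall_state(eps):
    """Initial pair probabilities p_2(a, b) for the one-dimensional two-state example:
    p_2(1,0) = p_2(0,1) = eps and p_2(0,0) = p_2(1,1) = 1/2 - eps."""
    return np.array([[0.5 - eps, eps],
                     [eps, 0.5 - eps]])

def rk4_integrate(rhs, p0, dt, n_steps):
    """Integrate dp/dt = rhs(p) with a fixed-step classical Runge-Kutta scheme."""
    p = np.array(p0, dtype=float)
    for _ in range(n_steps):
        k1 = rhs(p)
        k2 = rhs(p + 0.5 * dt * k1)
        k3 = rhs(p + 0.5 * dt * k2)
        k4 = rhs(p + dt * k3)
        p = p + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p

def wall_velocity(rhs, p2, eps):
    """Estimate of the interface velocity, dp_1(1)/dt divided by 2*eps."""
    dp1 = rhs(p2).sum(axis=1)        # dp_1(a)/dt = sum_b dp_2(a, b)/dt
    return dp1[1] / (2.0 * eps)

# Usage sketch: integrate through the transient, then read off the velocity.
# p2_t = rk4_integrate(my_rhs, domain_wall_state(1e-3), dt=0.01, n_steps=10_000)
# v = wall_velocity(my_rhs, p2_t, 1e-3)    # my_rhs is a placeholder for the model
```

In practice the step size should be reduced until the results no longer change, and an adaptive library integrator can be substituted for `rk4_integrate` without altering the rest of the procedure.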

In the second method the stationary solution(s) [$\dot{p}_2(s_1, s_2) = 0$] can be found by using the standard
Newton-Raphson method (Ralston, 1965). The efficiency of this method can be increased by reducing the number
of independent configuration probabilities by taking all symmetries into consideration. For this iteration
technique the solution algorithm should be started from a state lying within the region of convergence.
Sometimes it is convenient to find a solution by using the above-mentioned numerical integration for a given
set of parameter values, and afterward to repeat the Newton-Raphson method while the parameters are
varied very slowly. It is emphasized that this method can find the unstable solutions too.
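The following sketch shows a generic Newton-Raphson iteration with a finite-difference Jacobian. It is demonstrated on a toy one-variable equation of motion, $\dot{x} = x(1 - x)(x - 0.3)$, chosen only for illustration because its fixed point at $x = 0.3$ is unstable; the function name `newton_raphson` and all parameter values are assumptions.

```python
import numpy as np

def newton_raphson(f, x0, tol=1e-12, max_iter=50, h=1e-7):
    """Find x with f(x) = 0 by Newton-Raphson iteration.

    The Jacobian is approximated by forward differences with step h.
    """
    x = np.atleast_1d(np.array(x0, dtype=float))
    for _ in range(max_iter):
        fx = np.atleast_1d(f(x))
        if np.max(np.abs(fx)) < tol:
            return x
        jac = np.empty((fx.size, x.size))
        for j in range(x.size):
            xh = x.copy()
            xh[j] += h
            jac[:, j] = (np.atleast_1d(f(xh)) - fx) / h
        x = x - np.linalg.solve(jac, fx)   # Newton step
    raise RuntimeError("Newton-Raphson did not converge")

def toy_rhs(x):
    """Toy equation of motion with fixed points at 0, 0.3 (unstable) and 1."""
    return x * (1.0 - x) * (x - 0.3)

# Started inside the region of convergence, the iteration reaches the unstable
# fixed point, which direct time integration would move away from.
root = newton_raphson(toy_rhs, [0.35])
```

When applied to the configuration probabilities, `f` would return the time derivatives of the independent probabilities, and the solution found for one parameter value can serve as the starting point for the next, slightly changed value.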

At the level of the two-site approximation the construction of the configuration probabilities on large clusters
neglects several pair correlations that may be important for some models (e.g., the construction represented
graphically in Fig. C.4 does not involve explicitly the pair correlations between the states $s_3$ and $s_4$). The
best way to overcome this difficulty is to extend the method further. The larger the cluster size we use in
this technique, the higher the accuracy we can achieve. Sometimes the gradual increase of the cluster size
results in a qualitative improvement in the prediction, as discussed in Secs. 6.6 and 7.3. The investigation of
the configuration probabilities on $2 \times 2$ (or larger) clusters becomes important for models where the local
structure of the distribution, the motion of invasion fronts, and/or the long-range correlations play relevant roles.
The corresponding set of equations of motion can be derived via a straightforward generalization of
the method described above. That is, we sum the contributions to the time derivative of the configuration
probabilities coming from all the possible elementary processes.




Fig. C.5. Strategy adoption from $s_2$ to $s_1$ will modify all the four four-site configuration probabilities on the $2 \times 2$ blocks
including the site $s_1$.



Evidently, the contributions to $p_4(s_1, s_2, s_3, s_4)$ are determined by the strategy distributions on larger
clusters. However, the corresponding configuration probabilities can be constructed from four-site
configuration probabilities as shown in Fig. C.5, where we assumed that the invasion rates are affected by the
surrounding sites. Using this trick the quantities $\dot{p}_4(s_1, s_2, s_3, s_4)$ can be expressed as nonlinear functions of
$p_4(s_1, s_2, s_3, s_4)$. The above symmetric construction takes explicitly into account all the relevant four-site
correlations, while the condition of normalization for the large clusters is no longer satisfied. Generally this
shortcoming of the present construction does not cause difficulties in the calculations.

The applicability of this method is limited by the large number of different configuration probabilities,
which increases exponentially with the cluster size, and also by the increasing number of terms in the
corresponding equations of motion. Current computer facilities allow us to perform such numerical investigations
on even $3 \times 3$ clusters if the dynamical rule is not very complicated (Szolnoki and Szabó, 2005). For
one-dimensional systems this method has been used successfully up to cluster size 11 (Dickman, 2002) for the
investigation of a three-state stochastic sandpile model, and even longer clusters could be studied for some
two-state systems (Szolnoki, 2002).
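The exponential growth can be made concrete with a one-line count. The helper name `n_configurations` is hypothetical, and the count is the raw number of configurations, before any reduction by symmetries or compatibility conditions.

```python
def n_configurations(n_states, n_sites):
    """Raw number of configuration probabilities on a cluster of n_sites sites."""
    return n_states ** n_sites

# Three-state model on a 3 x 3 cluster and on an 11-site chain.
count_3x3 = n_configurations(3, 9)
count_chain = n_configurations(3, 11)
```

For three states this gives 19683 probabilities on a $3 \times 3$ cluster and 177147 on an 11-site chain, which indicates why exploiting symmetries matters in practice.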

Knowing the configuration probabilities for the different phases one can give an estimate for the velocity
of the interface separating two stationary domains. The method suggested by Ellner et al. (1998) is based on
the pair correlations in the confronting phases. This approach proved to be successful for an evolutionary
Prisoner's Dilemma on the one-dimensional lattice (Szabó et al., 2000) and also for a driven lattice gas model
(Dickman, 2001).

Disregarding technical difficulties, this method can be easily adapted either for other (spatial) lattice
structures with arbitrary dimensions or for Bethe lattices (Szabó, 2000). In fact, the pair approximation
(with a construction represented in Fig. C.4) is expected to give a better prediction on Bethe lattices,
because here the correlation between two sites can be mediated along only a single path.

The above description concentrated only on the effect of asynchronous nearest-neighbor invasions
(strategy adoptions). With suitable modifications, however, this method can be adjusted to consider other
local dynamical rules, such as the appearance of mutants, or strategy exchange and diffusion, where two
site variables change simultaneously.

Variants of this method can be used to study stationary states (evolved from a random initial state) for
deterministic or stochastic cellular automata (Gutowitz et al., 1987; Atman et al., 2003). In this case one
should derive a set of recursion equations for the configuration probabilities at a discrete time $t + 1$ as a
function of the configuration probabilities at time $t$. Using the above described construction of configuration
probabilities we can obtain a finite set of equations whose stationary solution can be found analytically or
numerically. In this case difficulties can arise for those constructions which do not conserve normalization
(such an example is shown in Fig. C.5).
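The simplest instance of such a recursion is the single-site (mean-field) level, sketched below for a deterministic two-state cellular automaton with a three-site neighborhood. The use of Wolfram rule numbers and the function name `mean_field_step` are choices made for this illustration only; higher levels would replace the product of single-site probabilities by a construction from pair or larger-cluster probabilities.

```python
from itertools import product

def mean_field_step(rule, p):
    """Single-site recursion p(t + 1) = F(p(t)) for an elementary cellular automaton.

    rule : Wolfram rule number (0..255), assumed encoding of the update table
    p    : probability of state 1 at time t, with sites treated as uncorrelated
    """
    new_p = 0.0
    for left, center, right in product((0, 1), repeat=3):
        # Bit number 4*left + 2*center + right of the rule gives the new state.
        output = (rule >> (4 * left + 2 * center + right)) & 1
        if output:
            weight = 1.0
            for s in (left, center, right):
                weight *= p if s == 1 else 1.0 - p
            new_p += weight
    return new_p

def iterate_to_stationary(rule, p0, n_steps=200):
    """Iterate the recursion from p0; a fixed number of steps is used for simplicity."""
    p = p0
    for _ in range(n_steps):
        p = mean_field_step(rule, p)
    return p
```

For rule 90 (each site becomes the exclusive-or of its two neighbors) the recursion reduces to $p(t+1) = 2p(t)[1 - p(t)]$, whose nontrivial stationary solution is $p = 1/2$.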
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